Refont
Revision as of 19:44, 13 April 2024
By Geep, 2024, and similar to the article about Q3Font.
As of 2024, the new "refont" command-line utility allows inspection and alteration of font metrics. It is designed to be an easier-to-use version of the inspection/alteration parts of Q3Font. It also provides a font analysis feature.
Comparison with Font Patcher
Refont can be seen as complementary to Font Patcher. The latter has additional features that may be better if planning a wholesale rearrangement or expansion of characters within a bitmap. But Font Patcher requires installation of Perl, while refont.exe is more drop and go.
Comparison with Q3Font
For metric alteration, q3font has a native "*.fnt" human-readable format, and supports DAT <==> FNT conversions. For compatibility, so does refont.
Advantages of Q3Font
- This has been an important traditional tool for Doom3/TDM font manipulation.
- In addition to production/consumption of FNT, it -
  - Allows creation of DAT & TGA files from a TrueType font (although for this purpose, ExportFontToDoom3 has been preferred in practice).
  - Can write (but not read) an alternative "compact report".
Advantages of Refont
- It's open-source, so further C++ development is possible.
- It has its own native human-readable format, "*.ref", similar to .fnt but better.
The better features of REF are:
- Codepoints are enumerated in not just decimal, but also hex.
- Coordinates of the overall image box within its bitmap are expressed in pixel units, rather than [0...1] fractional floating point. So the user is spared having to manually do this tedious and error-prone calculation.
- If an optional, separate, read-only annotation file is provided, refont automatically decorates the .ref file with helpful information as the file is generated. This is explained below.
Because the code is open-source, there's better clarity about -
- how parsing and calculations are done.
- what warnings are emitted.
Also, using "-stats", refont can write (but not read) an analysis of a DAT file, particularly looking for problematic/unimplemented Western-language characters.
For Inspection & Correction of DAT files
To Get the Data in Readable, Editable Form
For viewing and altering in a text editor, two formats are provided. Both list every codepoint, in order from 0 to 255, as a block of data. Within each block, every item gets its own line.
As a FNT File
To create a q3font-compatible "<dat-name>.fnt" file:
refont -decompile <path to given .dat file> -fnt
where the path can be a full path, a relative path (e.g., starting with "./" or "../"), or just the file name in the current working directory. The resulting .fnt file will appear in the same directory. It will have the same name as the .dat file, except for the extension. As a convenience for keeping track of versions, the file name is not required to be in the standard fontImage_nn.dat form.
Example code block for the letter A:
...
char 65
{
height 18
top 18
bottom 0
pitch 3
xSkip 18
imageWidth 20
imageHeight 18
s 0.769531
t 0.250000
s2 0.847656
t2 0.320313
glyph 0
shaderName fonts/stone_0_24.tga
}
...
As a REF File
To create a refont-specific "<dat-name>.ref" file:
refont -decompile <path to given .dat file>
The same path and naming conventions as for .fnt generation discussed above apply here.
Example code block for the letter A when no supplemental annotation file is present:
...
char 65 (0x41)
{
height 18
top 18
bottom 0
pitch 3
xSkip 18
imageWidth 20
imageHeight 18
coord_s 197
coord_t 64
coord_s2 217
coord_t2 82
glyph 0
shaderName fonts/stone_0_24.tga
}
...
The starting line and the four "coord_..." lines differ from FNT format. The starting line also expresses the codepoint value in hex. And rather than s, t, s2, t2 in fractional [0...1] floating point, the "coord_..." values are expressed (and editable) in pixel units.
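The relationship between the two coordinate styles can be illustrated with a short Python sketch. It is not taken from refont's source: the helper names are hypothetical, the 256-pixel bitmap size is inferred from the example values above, and rounding to the nearest pixel is an assumption about how the conversion behaves.

```python
def frac_to_pixel(frac: float, bitmap_size: int = 256) -> int:
    """Convert a [0...1] fractional coordinate to a pixel offset (assumed rounding)."""
    return round(frac * bitmap_size)


def pixel_to_frac(pixel: int, bitmap_size: int = 256) -> float:
    """Convert a pixel offset back to a [0...1] fractional coordinate."""
    return pixel / bitmap_size


# Fractional values from the FNT example block for letter A.
fnt_coords = {"s": 0.769531, "t": 0.250000, "s2": 0.847656, "t2": 0.320313}

# Pixel values as they would appear on the coord_... lines of a REF block.
ref_coords = {f"coord_{key}": frac_to_pixel(value) for key, value in fnt_coords.items()}
print(ref_coords)
```

With these inputs the result matches the REF example (197, 64, 217, 82). The box is then 20 pixels wide and 18 pixels high, which agrees with imageWidth and imageHeight in the same block.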
When Editing
Refont has a minimal parser. To keep it happy, preserve the file's line structure. In particular:
- every REF and FNT file has the same number of lines.
- all keywords within a block must be present, and in a particular order.
- each block must have exactly the same number of lines (except for the special one at the end).
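Before compiling an edited file, it can help to sanity-check a block against these rules. The following Python sketch is only an illustration and does not reproduce refont's actual parser. The function names are hypothetical, and it assumes the block layout shown in the REF example (a "char" line, "{", one keyword per line, "}") with optional trailing "//" comments.

```python
# Keyword order as it appears in the REF example block (assumed to be the required order).
REF_KEYS = ["height", "top", "bottom", "pitch", "xSkip", "imageWidth",
            "imageHeight", "coord_s", "coord_t", "coord_s2", "coord_t2",
            "glyph", "shaderName"]


def strip_comment(line: str) -> str:
    """Drop a trailing '//' comment and surrounding whitespace."""
    return line.split("//", 1)[0].strip()


def check_block(lines: list[str]) -> dict[str, str]:
    """Check one REF block's framing and keyword order; return its key/value pairs."""
    body = [strip_comment(line) for line in lines]
    if len(body) < 3 or not body[0].startswith("char ") or body[1] != "{" or body[-1] != "}":
        raise ValueError("unexpected block framing")
    pairs = [line.split(None, 1) for line in body[2:-1]]
    if any(len(pair) != 2 for pair in pairs):
        raise ValueError("a keyword line is missing its value")
    keys = [key for key, _ in pairs]
    if keys != REF_KEYS:
        raise ValueError(f"keywords missing or out of order: {keys}")
    return dict(pairs)


block = """char 65 (0x41) // A
{
height 18
top 18
bottom 0
pitch 3
xSkip 18
imageWidth 20
imageHeight 18
coord_s 196 // WAS: 197
coord_t 64
coord_s2 217
coord_t2 82
glyph 0
shaderName fonts/stone_0_24.tga
}""".splitlines()

print(check_block(block)["coord_s"])
```

A check like this only catches structural slips such as a deleted line or a reordered keyword. It says nothing about whether the metric values themselves are sensible.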
The meaning of the font metrics is explained elsewhere (LINKS COMING SOON). The Q3Font article has an example of simple metric editing that is also pertinent to refont.
To Move Changed FNT or REF Data Back to a DAT
refont -compile <path to given .fnt or .ref file>
This reverses the -decompile process, to create a .dat file whose name is derived from the .fnt or .ref file. Importantly, this decompile/compile cycle leaves .tga/.dds files untouched.
The Annotation File
When told to generate a REF file (but not FNT), refont.exe also looks for a file called "refont_char_annotations.txt", in the same folder as refont.exe. That file should have exactly 256 data lines in it, one for each of the codepoints, in order. As the REF is generated, the line that starts a codepoint's data block gets the corresponding annotation line appended to it. Having the annotations come from an external file, instead of being hard-coded into refont, allows the required flexibility as to what information is presented. It even potentially allows refont to be used with other idTech3/4 games that don't necessarily use the TDM-specific codepoints. Also, the annotation files contain helpful specific comments (as lines that start with ";").
Each annotation:
- indicates what character symbol is expected; it is oblivious as to what character if any is actually within the bounding box described by the DAT data.
- lives only in a REF file, and is not retained when converted to a DAT file. This is also true for any manual supplementary annotations you might make.
This is clearer with examples. In 'Downloads' below are 4 different versions of "refont_char_annotations[...].txt". Three of these, specific to TDM's custom 'english' (i.e., composite European) codepage, are discussed here. To deploy, edit the filename of one of them to remove the square bracket text.
English with Symbol Only
This simple format is best for most purposes. The file entry for letter 'A' is just:
A
Then the block in a generated REF file would have that string appended, but with " // " inserted.
...
char 65 (0x41) // A
{
height 18
top 18
bottom 0
pitch 3
xSkip 18
imageWidth 20
imageHeight 18
coord_s 197
coord_t 64
coord_s2 217
coord_t2 82
glyph 0
shaderName fonts/stone_0_24.tga
}
...
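How the starting line of a block is assembled can be sketched in a few lines of Python. This is an illustration inferred from the examples in this article, not refont's C++ code. The function name is hypothetical, and the two-digit lowercase hex form is an assumption based on the sample output.

```python
def ref_header(codepoint: int, annotation: str = "") -> str:
    """Build the line that starts a codepoint's block in a REF file."""
    line = f"char {codepoint} (0x{codepoint:02x})"
    if annotation:
        # The " // " separator is added only when there is an annotation to show.
        line += " // " + annotation
    return line


print(ref_header(65, "A"))
print(ref_header(65))
```

The first call gives the annotated form shown above. The second gives the plain form that appears when no annotation file is present.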
English with Unicode-16 Codes and Names
This supplies the Unicode Consortium's official 16-bit codepoints (where U+NNNN = 0xNNNN) and their names, in the formal all-caps convention.
The file entry for letter 'A' is:
U+0041 LATIN CAPITAL LETTER A
Then the first line of the corresponding block in a generated REF file would show:
...
char 65 (0x41) // U+0041 LATIN CAPITAL LETTER A
...
Here's another example from that REF:
...
char 129 (0x81) // U+015A LATIN CAPITAL LETTER S WITH ACUTE
...
For control or TDM undefined characters, this annotation file assumes a hollow box glyph is the appropriate mapping:
...
char 0 (0x00) // U+25A1 WHITE SQUARE
...
But there are other choices, so edit the annotation file as you see fit.
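If you want to produce entries in this style for your own annotation file, Python's standard unicodedata module can look up the official names. This is a minimal sketch: the function name is hypothetical, and the "<no name>" fallback for unnamed codepoints (such as control characters) is an arbitrary choice, not something the sample file uses.

```python
import unicodedata


def unicode_annotation(codepoint: int) -> str:
    """Format a Unicode codepoint as 'U+NNNN OFFICIAL NAME'."""
    name = unicodedata.name(chr(codepoint), "<no name>")
    return f"U+{codepoint:04X} {name}"


print(unicode_annotation(0x0041))
print(unicode_annotation(0x015A))
```

Note that the argument is the Unicode codepoint, not the TDM codepage position. For example, TDM's char 129 maps to U+015A, so that mapping still has to come from you.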
English with 8859-x Source
This verbose format may be useful during font analysis. For letters that are in their expected locations as defined by ASCII or ISO 8859-1 (Latin-1) or Windows-1252 ("Western Europe"), the annotation is fairly minimal. Thus, letter 'A' has this:
0x41 is A
(TIP: If authoring your own annotation file, consider including a codepoint number on each line, like "0x41" here, to make the process easier.)
Then the first line of the corresponding block in a generated REF file would show:
...
char 65 (0x41) // 0x41 is A
...
Here are a few more interesting examples from that REF, with more expansive annotations:
...
char 5 (0x05) // 0x05 is a control character, unusable in TDM
...
char 166 (0xa6) // 0xA6 is Š (not ISO 8859-1 ¦) (from A9 in ISO 8859-2) (TDM only)
...
Revising or Making Your Own Annotations File
If creating your own refont_char_annotations.txt file, note that encoding upper-range characters in UTF-8 will be more universal than using Windows region-specific codepages. To enter a UTF-8 character using its four-digit hex Unicode value (say, in a range of interest 0x0080-0x00FF):
- Under Windows, type the four hex digits followed by Alt-X.
- Under Linux, hold down Ctrl+Shift, type U followed by the 4 digits; release Ctrl+Shift.
When refont reads refont_char_annotations.txt, any line that starts with ";" will be ignored. This lets you put comments into the file that will not propagate to the generated REF file.
Other than that, be sure to have exactly 256 lines of annotations. A line can be empty, i.e., have only the linebreak.
When generating a REF file, refont will automatically add " // " before each non-empty annotation, so don't include that.
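A quick way to check a hand-made annotation file against these rules is sketched below in Python. It is not part of refont and the function name is hypothetical. It assumes that every line not starting with ";" counts as a data line, including empty ones.

```python
def load_annotations(text: str) -> list[str]:
    """Return the 256 annotation lines, skipping ';' comment lines."""
    data_lines = [line for line in text.splitlines() if not line.startswith(";")]
    if len(data_lines) != 256:
        raise ValueError(f"expected 256 annotation lines, found {len(data_lines)}")
    return data_lines


# Build a small in-memory sample: one comment line plus 256 data lines.
sample = "; sample comment, not propagated\n"
sample += "\n".join(f"0x{codepoint:02X} sample" for codepoint in range(256)) + "\n"

annotations = load_annotations(sample)
print(annotations[65])
```

For a real file, read it first with open(path, encoding="utf-8").read() and pass the result in. Stray blank lines at the end of the file count as data lines in this sketch, so they show up as a wrong total.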
Adding Your Own Annotations to a REF File
Beyond the refont_char_annotations.txt content, you can add certain commentary to the REF file without affecting its ability to convert to a DAT file. Here are the rules:
- Don't add or remove lines, or comment out non-blank lines.
- Feel free to append a comment to the end of a line, by beginning the comment with " //".
- Likewise, on a line starting with "char" that had an annotation automatically added, you can revise the annotation as you see fit.
Example, with a revised annotation on the "char" line and comments appended to two "coord_" lines:
...
char 65 (0x41) // Capital A was slightly clipped. Decremented s, s2 by 1. Looks OK now.
{
height 18
top 18
bottom 0
pitch 3
xSkip 18
imageWidth 20
imageHeight 18
coord_s 196 // WAS: 197
coord_t 64
coord_s2 216 // WAS: 217
coord_t2 82
glyph 0
shaderName fonts/stone_0_24.tga
}
...
In this way, you can mark up a REF file as a record of work to do, work in progress, or work completed.
Errors and Warnings
During the compile or decompile, errors and warnings may be generated in the console. Fatal errors are largely due to:
- file i/o problems
- deviations from the strict DAT, REF, and FNT formats.
Warnings, which are not fatal, are at this time largely due to anomalous numeric values. The TDM engine can tolerate some of these, so they don't necessarily need fixing. The warnings could in theory be further improved in the future; see comments in the source code for ideas.
For Statistics on Unimplemented or Problematic Font Characters
DAT file analysis here is primarily designed to benefit English/European fonts. It will use the optional annotation file if provided. Syntax:
refont -stats <path to given .dat file>
Use "-stats" by itself, not combined with "-decompile".
The resulting analysis report will have the same name and location as the DAT file, but with a ".txt" suffix. The report provides categorized totals and (for problematic characters) itemizations across all 256 character codepoints. The counts of problems should be considered minimums, since no inspection of bitmap files (TGA or DDS) is done. But a perhaps surprising amount of insight can be gleaned just from DAT file metrics. Itemizations include annotations from the "refont_char_annotations.txt" file, if provided.
In the hunt for missing characters, 3 signatures are important:
- Presumed 'Hollow Box' = glyph's 'shadername' is <fontname>_0_<size>.dds only, and s & t are zero, but not s2 & t2
- '<Space>' = same test as 'Hollow Box', but pointed to by char 32 (0x20)
- 'Zero Box' = No glyph box (s, t, s2, t2 all zero)
A given analysis will assume either 'Hollow Box' or '<Space>' but not both.
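The three signatures can be expressed as simple tests. The Python sketch below is one reading of the descriptions above, not a copy of refont's logic, and all names in it are hypothetical. It assumes that "but not s2 & t2" means s2 and t2 are not both zero, that the file extension in shaderName is not significant, and that the box at the origin of bitmap 0 is labelled "<Space>" only when char 32 itself points to it. The coordinates for codepoints 5 and 32 are made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    shader_name: str
    s: float
    t: float
    s2: float
    t2: float


# Matches names such as "fonts/stone_0_24.tga" (first bitmap of the font).
FIRST_BITMAP = re.compile(r"_0_\d+\.\w+$")


def is_zero_box(g: Glyph) -> bool:
    return (g.s, g.t, g.s2, g.t2) == (0, 0, 0, 0)


def is_origin_box(g: Glyph) -> bool:
    """Shared test for 'Hollow Box' and '<Space>' under the assumptions above."""
    return (FIRST_BITMAP.search(g.shader_name) is not None
            and (g.s, g.t) == (0, 0)
            and (g.s2, g.t2) != (0, 0))


def classify(glyphs: dict[int, Glyph]) -> dict[int, str]:
    """Label every codepoint; an analysis uses 'Hollow Box' or '<Space>', never both."""
    space = glyphs.get(32)
    origin_label = "<Space>" if space is not None and is_origin_box(space) else "Hollow Box"
    labels = {}
    for codepoint, g in glyphs.items():
        if is_zero_box(g):
            labels[codepoint] = "Zero Box"
        elif is_origin_box(g):
            labels[codepoint] = origin_label
        else:
            labels[codepoint] = "Glyph"
    return labels


glyphs = {
    5: Glyph("fonts/stone_0_24.tga", 0, 0, 12, 18),      # made-up box at the origin
    32: Glyph("fonts/stone_0_24.tga", 30, 0, 40, 18),    # made-up space glyph elsewhere
    65: Glyph("fonts/stone_0_24.tga", 197, 64, 217, 82),
    200: Glyph("fonts/stone_0_24.tga", 0, 0, 0, 0),
}
print(classify(glyphs))
```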
The analysis provides totals and itemizations - grouped into lower (0-127) and upper ranges (128-255) - in 3 passes:
- Pass 1 - Handling of unprintable/unsupported/missing codepoints, indicated by Hollow Box, Zero Box, or <Space>. Some of these are not a problem, but some are tagged as "Undesirable".
- Pass 2 - Bad glyph box (negative s, t, s2, or t2; or s2 <= s, t2 <= t) or good glyph box with dubious metrics (imageHeight <= 0, imageWidth <= 0, imageHeight != height). Excludes those already counted as "Undesirable" in Pass 1.
- Pass 3 - Detection of duplicate glyph boxes (other than Hollow Box, Zero Box, or <Space>). Detected by: glyph's values for shadername, s, t, s2, & t2 exactly match those of another codepoint. These are grouped into "Dup Sets".
Finally, across all passes, a count is given of the minimum "Total Glyphs Needing Work".
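The Pass 2 and Pass 3 checks can be sketched in Python as follows. This is an illustration built from the descriptions above, not the code in refont.cpp. The function names and the dictionary layout are hypothetical, and the caller is assumed to have already removed the codepoints that each pass excludes.

```python
from collections import defaultdict


def box_problems(g: dict) -> list[str]:
    """Pass 2 style checks on one glyph's metrics; an empty list means no finding."""
    problems = []
    if min(g["s"], g["t"], g["s2"], g["t2"]) < 0:
        problems.append("negative coordinate")
    if g["s2"] <= g["s"] or g["t2"] <= g["t"]:
        problems.append("empty or inverted box")
    if not problems:
        # The box itself is good, so look for dubious metrics.
        if g["imageWidth"] <= 0 or g["imageHeight"] <= 0:
            problems.append("non-positive image size")
        if g["imageHeight"] != g["height"]:
            problems.append("imageHeight != height")
    return problems


def dup_sets(glyphs: dict[int, dict]) -> list[list[int]]:
    """Pass 3 style grouping: codepoints whose shaderName and box match exactly."""
    groups = defaultdict(list)
    for codepoint in sorted(glyphs):
        g = glyphs[codepoint]
        key = (g["shaderName"], g["s"], g["t"], g["s2"], g["t2"])
        groups[key].append(codepoint)
    return [members for members in groups.values() if len(members) > 1]


letter_a = {"shaderName": "fonts/stone_0_24.tga", "s": 197, "t": 64, "s2": 217, "t2": 82,
            "height": 18, "imageWidth": 20, "imageHeight": 18}
inverted = dict(letter_a, s=217, s2=197)           # made-up bad box
glyphs = {65: letter_a, 192: dict(letter_a), 66: inverted}

print(box_problems(letter_a))
print(box_problems(inverted))
print(dup_sets(glyphs))
```

In this made-up data, codepoints 65 and 192 share one box and so form a single "Dup Set", which is the kind of finding that suggests an accented letter is borrowing its unaccented glyph.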
Downloads
Release 2 of April 7, 2024:
- refont.exe
- sample annotation file for English/European, recommended
- sample annotation for Russian, recommended. To use, edit filename to remove "[russian]" substring.
- source code: refont.cpp
For More
See the summary analysis of 2.12 TDM fonts, based on applying 'refont -stats ...' to all 'english' DAT files.