Multilanguage Display: Difference between revisions

Latest revision as of 17:44, 19 June 2026

Background

This describes a particularly challenging use case of a TDM methodology that has been around for some time. The motivation to detail it was a 2026 update by Geep to the language map files, following a revision of the Carleton 24pt font used throughout the Main Menu.

In TDM's main menu, the Settings/Video/Language page attempts to list all the supported languages in their native form with respect to character set, that is, untranslated. Showing multiple language strings together is difficult, but some tricks are available. The result has been a reasonable near-term workaround for this particular page, applicable to comparable cases. (Comprehensive support for multilanguage display likely involves a major restructuring, for instance, moving to a native Unicode architecture with combined Latin and Cyrillic bitmaps.)

Recall that strings for the main menu are found in the utf8 file tdm_base01.pk4/strings/all.lang, which has sections for [English], [German], etc. Each section is used to generate the corresponding file for distribution, e.g., french.lang or german.lang, in a language-specific 8-bit encoding. When you play TDM and select a given language, the corresponding .lang file (if provided) is read.

Except for Cyrillic Russian, these files use encodings within the ISO-8859 family (as detailed in I18N_-_Charset). Any specific encoding is a bottleneck: by design, not every character in one member of the ISO-8859 family is available in the other members.

Language Names in Idealized UTF8 Form

Only the TDM-exposed languages are shown here. The all.lang file has additional strings for other potential Latin and Cyrillic languages. For reference, the generated encoding for the <language>.lang file is shown in the comment.

	"#str_02460"	"English"		// English	[ISO-8859-1]
	"#str_02461"	"Deutsch"		// German	[ISO-8859-1]
	"#str_02462"	"Español"		// Spanish	[ISO-8859-1]
	"#str_02463"	"Français"		// French	[ISO-8859-15]
	"#str_02464"	"Português"		// Portuguese	[ISO-8859-1]
	"#str_02465"	"Polski"			// Polish	[ISO-8859-2]
	"#str_02466"	"Italiano"		// Italian	[ISO-8859-1]
	"#str_02467"	"Česky"			// Czech	[ISO-8859-2]
	"#str_02468"	"Русский"		// Russian	[WIN-1251]
	"#str_02469"	"Català"		// Catalan	[ISO-8859-1]
	"#str_02470"	"Dansk"			// Danish	[ISO-8859-1]
	"#str_02472"	"Nederlands"		// Dutch	[ISO-8859-1]
	"#str_02474"	"Magyar"		// Hungarian	[ISO-8859-2]
	"#str_02476"	"Svenska"		// Swedish	[ISO-8859-1]
	"#str_02477"	"Türkçe"		// Turkish	[ISO-8859-9]
	"#str_02479"	"Română"		// Romanian	[ISO-8859-16]
	"#str_02480"	"Slovenčina"		// Slovak	[ISO-8859-2]
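
As a quick sanity check, the pairing of a native name with its listed encoding can be tested with Python's standard codecs. This is only a sketch: `NATIVE_NAMES` is a hand-copied subset of the list above, `encodes_cleanly` is a hypothetical helper, `cp1251` is assumed to be the Python codec corresponding to WIN-1251, and TDM's own generator is not involved.

```python
# Subset of the native language names above, paired with the encoding
# listed for the generated <language>.lang file (hand-copied, illustrative).
NATIVE_NAMES = {
    "Français": "iso-8859-15",
    "Česky": "iso-8859-2",
    "Русский": "cp1251",  # assumed Python codec name for WIN-1251
    "Türkçe": "iso-8859-9",
    "Română": "iso-8859-16",
}


def encodes_cleanly(name: str, encoding: str) -> bool:
    """Return True if every character of name exists in the given encoding."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for name, encoding in NATIVE_NAMES.items():
    print(name, encoding, encodes_cleanly(name, encoding))
```

Each name fits its own encoding, but not necessarily anyone else's: `encodes_cleanly("Česky", "iso-8859-1")` is False, which is the bottleneck the rest of this page works around.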

First Compromise – Cyrillic vs. Latin Font

While there is some overlap in characters between Cyrillic and Latin encodings, it is insufficient for our purposes. So, for [English] and other European sections, a Latin transliteration from Cyrillic is used:

	"#str_02468"	"Russkiy"		// Russian

Going the other way, for [Russian], the current treatment is not to do a Cyrillic transliteration from Latin, but instead to drop accents, rendering just in ASCII (except of course “Russian”), e.g.:

	"#str_02460"	"English"		// English
	"#str_02461"	"Deutsch"		// German
	"#str_02462"	"Espanol"		// Spanish
	"#str_02463"	"Francais"		// French
	"#str_02464"	"Portugues"		// Portuguese
	"#str_02465"	"Polski"			// Polish
	"#str_02466"	"Italiano"		// Italian
	"#str_02467"	"Cesky"			// Czech
	"#str_02468"	"Русский"		// Russian
	"#str_02469"	"Catala"		// Catalan
	"#str_02470"	"Dansk"			// Danish
	"#str_02472"	"Nederlands"		// Dutch
	"#str_02474"	"Magyar"		// Hungarian
	"#str_02476"	"Svenska"		// Swedish
	"#str_02477"	"Turkce"		// Turkish
	"#str_02479"	"Romana"		// Romanian
	"#str_02480"	"Slovencina"		// Slovak
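
The accent dropping shown above was presumably done by hand, but it can be approximated with Unicode decomposition. The following is a minimal sketch, not the method actually used for all.lang; `drop_accents` is a hypothetical helper. It keeps any character whose base letter is not ASCII, so that "Русский" passes through untouched.

```python
import unicodedata


def drop_accents(text: str) -> str:
    """Replace accented Latin letters with their ASCII base letter.

    Characters whose base letter is not ASCII (e.g. Cyrillic) are kept as-is.
    """
    out = []
    for ch in text:
        # NFD splits a precomposed letter into base letter + combining marks.
        base = unicodedata.normalize("NFD", ch)[0]
        out.append(base if base.isascii() else ch)
    return "".join(out)


for name in ["Español", "Česky", "Türkçe", "Română", "Русский"]:
    print(name, "->", drop_accents(name))
```

Letters without a Unicode decomposition (such as ł) are left unchanged by this approach and would need a manual substitution.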

Moving between ISO-8859 Members

In the discussion here, let “L1” be the current language (from the player’s perspective) or the [language] section of interest in all.lang (from the translator’s perspective). Let “L2” be the language of the foreign word, i.e., L1 and L2 differ.

Type and Go

In several cases, the translator merely puts the ideal character in the all.lang strings. After <language>.lang generation, everything works. Cases are:

L1 and L2 both use the same ISO-8859 encoding. (Specifically, they are either both ISO-8859-1, which doesn’t need <language>.map files, or both ISO-8859-2, and have <language>.map files with, by convention, identical contents.) NOTE: For TDM’s purposes, ISO-8859-15 (French) can be considered part of the ISO-8859-1 family.
L2 character is ASCII (i.e., in ISO range 0x00-0x7f). This is always the case if L2 = English. Characters in this range have identical codepoints in all five TDM-supported ISO-8859 encodings and in the TDM target encoding.
L2 character is in the ISO range 0xA0-0xFF, with additional constraints.

For the last case, the L2 character must be either:

present at the same codepoint in all five TDM-supported ISO-8859 encodings, and in the TDM target encoding. For our Latin language names, these accented characters are like that:

		ç in "Français" and "Türkçe" at 0xE7
		à in "Català" at 0xE0
		â in "Română" at 0xE2
		ü in "Türkçe" at 0xFC

present at the same codepoint in ISO-8859 encodings for L1 and L2, and in the TDM target encoding.
present in different codepoints in ISO-8859 encodings for L1 and L2, but with each being either the same as the TDM target encoding, or mapped to the TDM target encoding by the <language>.map file.
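
The ISO side of these conditions can be checked mechanically. The sketch below uses Python's standard codecs to ask whether a character sits at the same codepoint in a set of ISO-8859 encodings; `shared_codepoint` is a hypothetical helper. It says nothing about the TDM target encoding or the <language>.map files, which still have to be looked up in the table further down.

```python
ISO_ENCODINGS = [
    "iso-8859-1", "iso-8859-2", "iso-8859-9", "iso-8859-15", "iso-8859-16",
]


def shared_codepoint(ch, encodings=ISO_ENCODINGS):
    """Return the codepoint of ch if it is identical in all given encodings.

    Returns None if ch is missing from any encoding or the codepoints differ.
    """
    codepoints = set()
    for enc in encodings:
        try:
            codepoints.add(ch.encode(enc)[0])
        except UnicodeEncodeError:
            return None  # character not present in this encoding
    return codepoints.pop() if len(codepoints) == 1 else None


print(hex(shared_codepoint("ç")))  # 0xe7
print(shared_codepoint("ñ"))       # None
```

Passing just the L1 and L2 encodings covers the second condition, e.g. `shared_codepoint("š", ["iso-8859-15", "iso-8859-16"])`.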

Tricks

If the L2 character is not represented in the ISO encoding for L1, then at <L1>.lang generation time, it will be replaced by a “?”. To work around this, the translator has two methods:

Trick Method 1 – Direct Stuffing

When the L1 encoding is ISO-8859-1 (or -9, which is identical to it in the 0xA0-0xBF range, or -15, which differs at only a few codepoints there) and the L2 character is in TDM’s 0xA0-0xBF range (or at 0xD7 or 0xF7), you can look up the ISO character associated with that codepoint (shown in parentheses in cells of the i18n charmap table), and type it. Examples:

[English] // and other ISO-8859-1, -9, or -15
	"#str_02467"	"¬esky"		// our default mapping is ISO 8859-1, so ¬ is shown as Č (¬)
	"#str_02480"	"Sloven®ina"		// Slovak (® in ISO-8859-1 is č in our font)

Here’s another example of that last stuffing from a TDM-unexposed language, showing also a diaeresis (0xA8):

	"#str_02481"	"Sloven¨®ina"		// Slovenian (southern slovenia) (¨ => š)

In this case, ISO-8859-1 would need to use the diaeresis, but ISO-8859-15 (French) would not, i.e., could type and go with š. The spreadsheet shows all available substitutions in light blue.
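
Direct stuffing amounts to asking "which L1 character occupies the TDM codepoint I want?". Here is a minimal sketch of that lookup, assuming the three TDM codepoints below (copied from the TDM column of the table on this page); `TDM_CODEPOINT` and `stuff` are hypothetical names, and the "?" fallback is imitated with Python's `errors="replace"` rather than TDM's actual generator.

```python
# TDM codepoints for a few letters that ISO-8859-1 lacks, copied from the
# TDM column of the coverage table (a small illustrative subset).
TDM_CODEPOINT = {"Č": 0xAC, "č": 0xAE, "š": 0xA8}


def stuff(text: str, l1_encoding: str = "iso-8859-1") -> str:
    """Swap each listed letter for the L1 character at its TDM codepoint."""
    out = []
    for ch in text:
        if ch in TDM_CODEPOINT:
            out.append(bytes([TDM_CODEPOINT[ch]]).decode(l1_encoding))
        else:
            out.append(ch)
    return "".join(out)


# Without stuffing, the unsupported letter degrades to "?".
print("Česky".encode("iso-8859-1", errors="replace"))  # b'?esky'

print(stuff("Česky"))        # ¬esky
print(stuff("Slovenčina"))   # Sloven®ina
print(stuff("Slovenščina"))  # Sloven¨®ina
```

With `l1_encoding="iso-8859-15"`, the š comes back as itself, because ISO-8859-15 already has š at 0xA8.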

Trick Method 2 – Special Mapping with Repurposed Character

If all else fails, a little-used character in L1 can be redirected to the L2 codepoint TDM wants, by an extra mapping command in the <L1>.map file. Current such tricks are...

To get ñ for "Español":

for iso 2, put ¨ (diaeresis 0xA8)
for iso 16, put ¶ (pilcrow 0xB6)

To get ê for "Português":

for iso 2: Put ´ (acute accent 0xBD)
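
The effect of such an extra mapping can be modeled as a byte-for-byte translation applied after the text has been encoded in L1. The sketch below covers only the two ñ tricks; `TRICK_REMAPS` and `to_tdm_bytes` are hypothetical, the real <L1>.map syntax is not reproduced, and any other entries a real map file contains are ignored.

```python
# Hypothetical byte remaps standing in for the extra <L1>.map entries
# described above; this is not the real .map file format.
TRICK_REMAPS = {
    "iso-8859-2": {0xA8: 0xF1},   # ¨ (diaeresis) -> TDM codepoint of ñ
    "iso-8859-16": {0xB6: 0xF1},  # ¶ (pilcrow)   -> TDM codepoint of ñ
}


def to_tdm_bytes(text: str, l1_encoding: str) -> bytes:
    """Encode text in L1, then apply that encoding's trick remaps."""
    raw = text.encode(l1_encoding)
    remap = TRICK_REMAPS.get(l1_encoding, {})
    table = bytes.maketrans(bytes(remap.keys()), bytes(remap.values()))
    return raw.translate(table)


print(to_tdm_bytes("Espa¨ol", "iso-8859-2"))   # b'Espa\xf1ol'
print(to_tdm_bytes("Espa¶ol", "iso-8859-16"))  # b'Espa\xf1ol'
```

The price of the trick is that the repurposed character can no longer be displayed as itself in that language, which is why a little-used one is chosen.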

TDM & ISO Char Sets - Details, Potential Tricks, Limitations

The main table below (after the Key) was drafted by Google AI, from a prompt that defined the header and an example row, and specified:

"Rows are a combination of all the printable 8-bit codepoints in the range 0x80-0xff defined in ISO-8859-1, -2, -9, -15, and -16. The rows are ordered by Unicode number. If a particular ISO standard does not include the character, leave that cell blank."

Subsequently, after Excel import, the TDM and Comments columns were filled in by Geep, and color coding added. Language mappings were independently color coded, then verified against existing language maps. Finally, the result was converted to wikitable format.

This table skips codepoints in the range 0x00-0x7f, because they are the same for all ISO-8859 standards. No mapping required.
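
The ISO columns of such a table can also be regenerated from Python's standard codecs, which is a convenient way to cross-check individual cells. This is a sketch under assumptions: `build_rows` is a hypothetical helper, the scan starts at 0xA0 because 0x80-0x9F holds only control codes in these five standards, and the TDM and Comments columns cannot be derived this way.

```python
ENCODINGS = [
    "iso-8859-1", "iso-8859-2", "iso-8859-9", "iso-8859-15", "iso-8859-16",
]


def build_rows(encodings=ENCODINGS):
    """Map each character to {encoding: codepoint}, ordered by Unicode number."""
    rows = {}
    for enc in encodings:
        for byte in range(0xA0, 0x100):
            try:
                ch = bytes([byte]).decode(enc)
            except UnicodeDecodeError:
                continue  # codepoint undefined in this encoding
            rows.setdefault(ch, {})[enc] = byte
    return dict(sorted(rows.items(), key=lambda item: ord(item[0])))


rows = build_rows()
for ch, cells in list(rows.items())[:5]:
    print(f"U+{ord(ch):04X}", {enc: hex(b) for enc, b in cells.items()})
```

An encoding missing from a row's dictionary corresponds to an empty cell in the table, e.g. `rows["ñ"]` has no entry for ISO-8859-2 or ISO-8859-16.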

Color Key
Cell	Meaning
Light Gray Text Row	Symbol not included in TDM custom character set (so TDM column blank). Can use for char substitution tricks.
0xNN on white	Type desired utf8 char (in Symbol column) in all.lang. Generation goes directly to TDM codepoint given.
0xNN, amber hilite	Type desired utf8 char in all.lang. Generation + mapping (in <language>.map file) goes to TDM codepoint via codepoint shown.
0xNN in gray	Type desired utf8 char in all.lang. Generation + mapping (in <language>.map file) goes to TDM codepoint OF SUBSTITUTE CHARACTER via codepoint shown.
0xNN in red	Character not part of this ISO. "0xNN" shown is not ISO, but for trick mapping (in <language>.map), to support main menu's multilingual "Languages" page.
Substitute Char	Character not part of this ISO. But as trick, you can stuff in this substitute char at the needed TDM codepoint.
n/a	Not available to use substitute character, due to trick mapping (in <language>.map) elsewhere.
Empty Cell	In ISO column, means character not part of this ISO (so may be candidate for future trick mapping; or simply not fully analyzed viz substitute char.)
Tan Row	Orphaned character in TDM set, not part of 5 ISO-8859 standards for current TDM-supported languages.

Unicode Coverage for TDM
Unicode	Symbol	TDM	1	2	9	15	16	Unicode Name	Comments
U+00A0		0xA0	0xA0	0xA0	0xA0	0xA0	0xA0	no-break space
U+00A1	¡		0xA1		0xA1	0xA1		inverted exclamation mark
U+00A2	¢		0xA2		0xA2	0xA2		cent sign
U+00A3	£		0xA3		0xA3	0xA3		pound sign
U+00A4	¤		0xA4	0xA4	0xA4			currency sign
U+00A5	¥		0xA5		0xA5	0xA5		yen sign
U+00A6	¦		0xA6		0xA6			broken bar
U+00A7	§	0xA7	0xA7	0xA7	0xA7	0xA7	0xA7	section sign
U+00A8	¨		0xA8	0xA8	0xA8			diaeresis	See also hack use for ñ
U+00A9	©		0xA9		0xA9	0xA9	0xA9	copyright sign
U+00AA	ª		0xAA		0xAA	0xAA		feminine ordinal indicator
U+00AB	«		0xAB		0xAB	0xAB	0xAB	left-pointing double angle quotation mark
U+00AC	¬		0xAC		0xAC	0xAC		not sign
U+00AD		0xAD	0xAD	0xAD	0xAD	0xAD	0xAD	soft hyphen
U+00AE	®		0xAE		0xAE	0xAE		registered sign
U+00AF	¯		0xAF		0xAF	0xAF		macron
U+00B0	°		0xB0	0xB0	0xB0	0xB0	0xB0	degree sign
U+00B1	±		0xB1		0xB1	0xB1	0xB1	plus-minus sign
U+00B2	²		0xB2		0xB2	0xB2		superscript two
U+00B3	³		0xB3		0xB3	0xB3		superscript three
U+00B4	´		0xB4	0xB4	0xB4			acute accent
U+00B5	µ		0xB5		0xB5	0xB5		micro sign
U+00B6	¶		0xB6		0xB6	0xB6	0xB6	pilcrow sign
U+00B7	·		0xB7		0xB7	0xB7	0xB7	middle dot
U+00B8	¸		0xB8	0xB8	0xB8			cedilla
U+00B9	¹		0xB9		0xB9	0xB9		superscript one
U+00BA	º		0xBA		0xBA	0xBA		masculine ordinal indicator
U+00BB	»		0xBB		0xBB	0xBB	0xBB	right-pointing double angle quotation mark
U+00BC	¼		0xBC		0xBC			vulgar fraction one quarter
U+00BD	½		0xBD		0xBD			vulgar fraction one half
U+00BE	¾		0xBE		0xBE			vulgar fraction three quarters
U+00BF	¿	0xBF	0xBF		0xBF	0xBF		inverted question mark
U+00C0	À	0xC0	0xC0		0xC0	0xC0	0xC0	latin capital letter a with grave
U+00C1	Á	0xC1	0xC1	0xC1	0xC1	0xC1	0xC1	latin capital letter a with acute
U+00C2	Â	0xC2	0xC2	0xC2	0xC2	0xC2	0xC2	latin capital letter a with circumflex
U+00C3	Ã	0xC3	0xC3		0xC3	0xC3		latin capital letter a with tilde
U+00C4	Ä	0xC4	0xC4	0xC4	0xC4	0xC4	0xC4	latin capital letter a with diaeresis
U+00C5	Å	0xC5	0xC5		0xC5	0xC5		latin capital letter a with ring above
U+00C6	Æ	0xC6	0xC6		0xC6	0xC6	0xC6	latin capital letter ae
U+00C7	Ç	0xC7	0xC7	0xC7	0xC7	0xC7	0xC7	latin capital letter c with cedilla
U+00C8	È	0xC8	0xC8		0xC8	0xC8	0xC8	latin capital letter e with grave
U+00C9	É	0xC9	0xC9	0xC9	0xC9	0xC9	0xC9	latin capital letter e with acute
U+00CA	Ê	0xCA	0xCA		0xCA	0xCA	0xCA	latin capital letter e with circumflex
U+00CB	Ë	0xCB	0xCB	0xCB	0xCB	0xCB	0xCB	latin capital letter e with diaeresis
U+00CC	Ì	0xCC	0xCC		0xCC	0xCC	0xCC	latin capital letter i with grave
U+00CD	Í	0xCD	0xCD	0xCD	0xCD	0xCD	0xCD	latin capital letter i with acute
U+00CE	Î	0xCE	0xCE	0xCE	0xCE	0xCE	0xCE	latin capital letter i with circumflex
U+00CF	Ï	0xCF	0xCF		0xCF	0xCF	0xCF	latin capital letter i with diaeresis
U+00D0	Ð	0xD0	0xD0		n/a	0xD0		latin capital letter eth	Same glyph as U+0110 latin capital letter d with stroke
U+00D1	Ñ	0xD1	0xD1		0xD1	0xD1		latin capital letter n with tilde
U+00D2	Ò	0xD2	0xD2		0xD2	0xD2	0xD2	latin capital letter o with grave
U+00D3	Ó	0xD3	0xD3	0xD3	0xD3	0xD3	0xD3	latin capital letter o with acute
U+00D4	Ô	0xD4	0xD4	0xD4	0xD4	0xD4	0xD4	latin capital letter o with circumflex	Formerly also mapped from 0x88; redundant, Ğ has 0x88 codepoint now
U+00D5	Õ	0xD5	0xD5		0xD5	0xD5		latin capital letter o with tilde
U+00D6	Ö	0xD6	0xD6	0xD6	0xD6	0xD6	0xD6	latin capital letter o with diaeresis
U+00D7	×		0xD7	0xD7	0xD7	0xD7		multiplication sign
U+00D8	Ø	0xD8	0xD8		0xD8	0xD8		latin capital letter o with stroke
U+00D9	Ù	0xD9	0xD9		0xD9	0xD9	0xD9	latin capital letter u with grave
U+00DA	Ú	0xDA	0xDA	0xDA	0xDA	0xDA	0xDA	latin capital letter u with acute
U+00DB	Û	0xDB	0xDB		0xDB	0xDB	0xDB	latin capital letter u with circumflex
U+00DC	Ü	0xDC	0xDC	0xDC	0xDC	0xDC	0xDC	latin capital letter u with diaeresis
U+00DD	Ý	0xDD	0xDD	0xDD		0xDD		latin capital letter y with acute
U+00DE	Þ	0xDE	0xDE			0xDE		latin capital letter thorn
U+00DF	ß	0xDF	0xDF	0xDF	0xDF	0xDF	0xDF	latin small letter sharp s
U+00E0	à	0xE0	0xE0		0xE0	0xE0	0xE0	latin small letter a with grave
U+00E1	á	0xE1	0xE1	0xE1	0xE1	0xE1	0xE1	latin small letter a with acute
U+00E2	â	0xE2	0xE2	0xE2	0xE2	0xE2	0xE2	latin small letter a with circumflex
U+00E3	ã	0xE3	0xE3		0xE3	0xE3		latin small letter a with tilde
U+00E4	ä	0xE4	0xE4	0xE4	0xE4	0xE4	0xE4	latin small letter a with diaeresis
U+00E5	å	0xE5	0xE5		0xE5	0xE5		latin small letter a with ring above
U+00E6	æ	0xE6	0xE6		0xE6	0xE6	0xE6	latin small letter ae
U+00E7	ç	0xE7	0xE7	0xE7	0xE7	0xE7	0xE7	latin small letter c with cedilla
U+00E8	è	0xE8	0xE8		0xE8	0xE8	0xE8	latin small letter e with grave
U+00E9	é	0xE9	0xE9	0xE9	0xE9	0xE9	0xE9	latin small letter e with acute
U+00EA	ê	0xEA	0xEA	0xBD	0xEA	0xEA	0xEA	latin small letter e with circumflex	Hack for iso 2: Put ´ (acute accent, 0xBD) to get ê for "Português"
U+00EB	ë	0xEB	0xEB	0xEB	0xEB	0xEB	0xEB	latin small letter e with diaeresis
U+00EC	ì	0xEC	0xEC		0xEC	0xEC	0xEC	latin small letter i with grave
U+00ED	í	0xED	0xED	0xED	0xED	0xED	0xED	latin small letter i with acute
U+00EE	î	0xEE	0xEE	0xEE	0xEE	0xEE	0xEE	latin small letter i with circumflex
U+00EF	ï	0xEF	0xEF		0xEF	0xEF	0xEF	latin small letter i with diaeresis
U+00F0	ð	0xF0	0xF0			0xF0		latin small letter eth
U+00F1	ñ	0xF1	0xF1	0xA8	0xF1	0xF1	0xB6	latin small letter n with tilde	Hacks to get ñ for "Español": for iso 2, put ¨ (diaeresis 0xA8); for iso 16, put ¶ (0xB6)
U+00F2	ò	0xF2	0xF2		0xF2	0xF2	0xF2	latin small letter o with grave
U+00F3	ó	0xF3	0xF3	0xF3	0xF3	0xF3	0xF3	latin small letter o with acute
U+00F4	ô	0xF4	0xF4	0xF4	0xF4	0xF4	0xF4	latin small letter o with circumflex	Formerly also mapped from 0x88; redundant, ğ has TDM 0x88 codepoint now
U+00F5	õ	0xF5	0xF5		0xF5	0xF5		latin small letter o with tilde
U+00F6	ö	0xF6	0xF6	0xF6	0xF6	0xF6	0xF6	latin small letter o with diaeresis
U+00F7	÷		0xF7	0xF7	0xF7	0xF7		division sign
U+00F8	ø	0xF8	0xF8		0xF8	0xF8		latin small letter o with stroke
U+00F9	ù	0xF9	0xF9		0xF9	0xF9	0xF9	latin small letter u with grave
U+00FA	ú	0xFA	0xFA	0xFA	0xFA	0xFA	0xFA	latin small letter u with acute
U+00FB	û	0xFB	0xFB		0xFB	0xFB	0xFB	latin small letter u with circumflex
U+00FC	ü	0xFC	0xFC	0xFC	0xFC	0xFC	0xFC	latin small letter u with diaeresis
U+00FD	ý	0xFD	0xFD	0xFD		0xFD		latin small letter y with acute
U+00FE	þ	0xFE	0xFE			0xFE		latin small letter thorn
U+00FF	ÿ	0xFF	0xFF		0xFF	0xFF	0xFF	latin small letter y with diaeresis
U+0102	Ă	0x8B		0xC3			0xC3	latin capital letter a with breve
U+0103	ă	0x9B		0xE3			0xE3	latin small letter a with breve
U+0104	Ą	0xAA	ª	0xA1	ª	ª	0xA1	latin capital letter a with ogonek
U+0105	ą	0xBA	º	0xB1	º	º	0xA2	latin small letter a with ogonek
U+0106	Ć	0x82		0xC6			0xC5	latin capital letter c with acute
U+0107	ć	0x92		0xE6			0xE5	latin small letter c with acute
U+0108	Ĉ	0x86						latin capital c with circumflex	Only in ISO-8859-3 (for Esperanto) at 0xC6
U+0109	ĉ	0x96						latin small c with circumflex	Only in ISO-8859-3 (for Esperanto) at 0xE6
U+010C	Č	0xAC	¬	0xC8	¬	¬	0xB2	latin capital letter c with caron	Substitute char "not sign" at 0xAC
U+010D	č	0xAE	®	0xE8	®	®	0xB9	latin small letter c with caron	Substitute char "registered sign" at 0xAE
U+010E	Ď	0xB3	³	0xCF	³	³		latin capital letter d with caron	Substitute char "superscript three" at 0xB3
U+010F	ď	0xB7	·	0xEF	·	·		latin small letter d with caron	Substitute char "middle dot" at 0xB7
U+0110	Đ	0xD0	0xD0	0xD0	n/a		0xD0	latin capital letter d with stroke	Same glyph as U+00D0 latin capital letter eth
U+0111	đ	0x90		0xF0	n/a		0xF0	latin small letter d with stroke
U+0118	Ę	0xAB	«	0xCA	«	«	0xDD	latin capital letter e with ogonek	Substitute char "left-pointing double angle quotation mark" at 0xAB
U+0119	ę	0xBB	»	0xEA	»	»	0xFD	latin small letter e with ogonek	Substitute char "right-pointing double angle quotation mark" at 0xBB
U+011A	Ě	0xA5	¥	0xCC	¥	¥		latin capital letter e with caron	Substitute char "yen sign" at 0xA5
U+011B	ě	0xA3	£	0xEC	£	£		latin small letter e with caron	Substitute char "pound sign" at 0xA3
U+011E	Ğ	0x88			0xD0			latin capital letter g with breve	As of TDM 2.13 (TDM codemap), 2.15 (turkish.map)
U+011F	ğ	0x98			0xF0			latin small letter g with breve	As of TDM 2.13 (TDM codemap), 2.15 (turkish.map)
U+0130	İ				0xDD			latin capital letter i with dot above	Turkish: utf8 "İ" will be mapped to "Î" (0xCE)
U+0131	ı				0xFD			latin small letter dotless i	Turkish: utf8 "ı" will be mapped to ASCII "i" (0x69)
U+0139	Ĺ			0xC5				latin capital letter l with acute
U+013A	ĺ			0xE5				latin small letter l with acute
U+013D	Ľ			0xA5				latin capital letter l with caron
U+013E	ľ			0xB5				latin small letter l with caron
U+0141	Ł	0xB1	±	0xA3	±	±	0xA3	latin capital letter l with stroke	Substitute char "plus-minus sign" at 0xB1
U+0142	ł	0xB5	µ	0xB3	µ	µ	0xB3	latin small letter l with stroke	Substitute char "micro sign" at 0xB5
U+0143	Ń	0x8C		0xD1			0xD1	latin capital letter n with acute
U+0144	ń	0x9C		0xF1			0xF1	latin small letter n with acute
U+0147	Ň	0x80		0xD2				latin capital letter n with caron
U+0148	ň	0xA1	¡	0xF2	¡	¡		latin small letter n with caron	Substitute char "inverted exclamation mark" at 0xA1
U+0150	Ő	0xB0	°	0xD5	°	°	0xD5	latin capital letter o with double acute	Similar to Ö, used in Hungarian. Substitute char "degree sign" at 0xB0
U+0151	ő	0xB9	¹	0xF5	¹	¹	0xF5	latin small letter o with double acute	Similar to ö, used in Hungarian. Substitute char "superscript one" at 0xB9
U+0152	Œ	0xBC	¼		¼	0xBC	0xBC	latin capital ligature oe	Substitute char "vulgar fraction one quarter" at 0xBC
U+0153	œ	0xBD	½		½	0xBD	0xBD	latin small ligature oe	Substitute char "vulgar fraction one half" at 0xBD
U+0154	Ŕ	0x89		0xC0				latin capital letter r with acute
U+0155	ŕ	0x99		0xE0				latin small letter r with acute
U+0158	Ř	0xD7	×	0xD8	×	×		latin capital letter r with caron	Substitute char "multiplication sign" at 0xD7
U+0159	ř	0xF7	÷	0xF8	÷	÷		latin small letter r with caron	Substitute char "division sign" at 0xF7
U+015A	Ś	0x81		0xA6			0xD7	latin capital letter s with acute
U+015B	ś	0x91		0xB6			0xF7	latin small letter s with acute
U+015C	Ŝ	0x85						latin capital letter s with circumflex	Only in ISO-8859-3 (for Esperanto) at 0xDE
U+015D	ŝ	0x95						latin small letter s with circumflex	Only in ISO-8859-3 (for Esperanto) at 0xFE
U+015E	Ş	0x8D		0xAA	0xDE			latin capital letter s with cedilla	Can stand in for "...comma under"
U+015F	ş	0x9D		0xBA	0xFE			latin small letter s with cedilla	Can stand in for "...comma under"
U+0160	Š	0xA6	¦	0xA9	¦	0xA6	0xA6	latin capital letter s with caron	Substitute char "broken bar" at 0xA6
U+0161	š	0xA8	¨	0xB9	¨	0xA8	0xA8	latin small letter s with caron	Substitute char "diaeresis" at 0xA8
U+0162	Ţ	0x8E		0xDE				latin capital letter t with cedilla	Can stand in for "...comma under"
U+0163	ţ	0x9E		0xFE				latin small letter t with cedilla	Can stand in for "...comma under"
U+0164	Ť	0xB2	²	0xAB	²	²		latin capital letter t with caron	Substitute char "superscript two" at 0xB2
U+0165	ť	0xB6	¶	0xBB	¶	¶		latin small letter t with caron	Substitute char "pilcrow sign" at 0xB6
U+016E	Ů	0xA9	©	0xD9	©	©		latin capital letter u with ring above	Substitute char "copyright sign" at 0xA9
U+016F	ů	0xAF	¯	0xF9	¯	¯		latin small letter u with ring above	Substitute char "macron" at 0xAF
U+0170	Ű	0xA2	¢	0xDB	¢	¢	0xD8	latin capital letter u with double acute	Similar to Ü, used in Hungarian. Substitute char "cent sign" at 0xA2
U+0171	ű	0xA4	¤	0xFB	¤	¤	0xF8	latin small letter u with double acute	Similar to ü, used in Hungarian. Substitute char "currency sign" at 0xA4
U+0178	Ÿ	0xBE	¾		¾	0xBE	0xBE	latin capital letter y with diaeresis	Substitute char "vulgar fraction three quarters" at 0xBE
U+0179	Ź	0x84		0xAC			0xAC	latin capital letter z with acute
U+017A	ź	0x94		0xBC			0xAE	latin small letter z with acute
U+017B	Ż	0x83		0xAF			0xAF	latin capital letter z with dot above
U+017C	ż	0x93		0xBF			0xBF	latin small letter z with dot above
U+017D	Ž	0xB4	´	0xAE	´	0xB4	0xB4	latin capital letter z with caron	Substitute char "acute accent" at 0xB4
U+017E	ž	0xB8	¸	0xBE	¸	0xB8	0xB8	latin small letter z with caron	Substitute char "cedilla" at 0xB8
U+01D3	Ǔ	0x8A						latin capital u with caron	Not found in ISO-8859. Pinyin tone marking
U+01D4	ǔ	0x9A						latin small u with caron	Not found in ISO-8859. Pinyin tone marking
U+0218	Ș	0x8D					0xAA	latin capital letter s with comma below	See also "...with cedilla"
U+0219	ș	0x9D					0xBA	latin small letter s with comma below	See also "...with cedilla"
U+021A	Ț	0x8E					0xDE	latin capital letter t with comma below	See also "...with cedilla"
U+021B	ț	0x9E					0xFE	latin small letter t with comma below	See also "...with cedilla"
U+02C7	ˇ				0xB7			caron
U+02D8	˘				0xA2			breve
U+02D9	˙				0xFF			dot above
U+02DB	˛				0xB2			ogonek
U+02DD	˝				0xBD			double acute accent
U+1E90	Ẑ	0x87						latin capital z with circumflex	Not found in ISO-8859. Rare use in Cyrillic-to-Latin transliteration, or Pinyin
U+1E91	ẑ	0x97						latin small z with circumflex	Not found in ISO-8859. Rare use in Cyrillic-to-Latin transliteration, or Pinyin
U+201D	”						0xB5	right double quotation mark
U+201E	„						0xA5	double low-9 quotation mark
U+20AC	€					0xA4	0xA4	euro sign

For More

I18N - Character mapping sketches the format and location of <language>.map files.
I18N - Charset is the main article about TDM language use and various encodings.

@@ Line 1: / Line 1: @@
-[PAGE IN PROGRESS]
 == Background ==
-''This describes a particular challenging use case of a TDM methodology that has been around for some time. The motivation was a 2026 update by Geep to the language map files, following a revision of Carleton 24pt font used throughout the Main Menu.''
+''This describes a particular challenging use case of a TDM methodology that has been around for some time. The motivation to detail this was a 2026 update by Geep to the language map files, following a revision of Carleton 24pt font used throughout the Main Menu.''
-In TDM main menu, the Settings/Video/Language page attempts to list all the supported languages in their native form with respect to character set, that is, untranslated. Showing multiple language strings together is difficult, but some tricks are available. The result has been a reasonable near-term workaround for this particular page, applicable to comparable cases. (Comprehensive support for multilanguage display likely involves a major restructuring, for instance, moving to a native Unicode architecture with combined Latin and Cyrillic bitmaps.)
+In TDM's main menu, the Settings/Video/Language page attempts to list all the supported languages in their native form with respect to character set, that is, untranslated. Showing multiple language strings together is difficult, but some tricks are available. The result has been a reasonable near-term workaround for this particular page, applicable to comparable cases. (Comprehensive support for multilanguage display likely involves a major restructuring, for instance, moving to a native Unicode architecture with combined Latin and Cyrillic bitmaps.)
-Recall that strings for the main menu are found in the utf8 file tdm_base01.pk4/strings/all.lang, where there is are sections for [English], [German], etc. A particular section is used to generate for distribution the corresponding language-specific file, e.g., french.lang, german.lang, etc. When you play TDM, and select a given language, the corresponding .lang file (if needed) is read.
+Recall that strings for the main menu are found in the utf8 file tdm_base01.pk4/strings/all.lang, where there are sections for [English], [German], etc. A particular section is used to generate for distribution the corresponding file, e.g., french.lang, german.lang, etc., in a language-specific 8-bit encoding. When you play TDM, and select a given language, the corresponding .lang file (if provided) is read.
-Except for Russian, these files use a specific 8-bit encoding within the ISO-8859 family. (The i18n – Char page has specifics here.) This specific encoding is a bottleneck... by design, not all characters in any one member of the ISO-8859 family will be accessible by some other members.
+Except for Cyrillic Russian, these files use encodings within the ISO-8859 family (as detailed in [[I18N_-_Charset]]). Any specific encoding is a bottleneck... by design, not all characters in any one member of the ISO-8859 family will be accessible by some other members.
 == Language Names in Idealized UTF8 Form ==
@@ Line 104: / Line 102: @@
 *for iso 2: Put ´ (acute accent 0xBD)
-== Details of TDM and ISO Char Sets and Potential Tricks and Limitations ==
+== TDM & ISO Char Sets - Details, Potential Tricks, Limitations ==
-[WORK IN PROGRESS][COLOR CODING NOT RIGHT YET]
 The main table below (after the Key) was drafted by Google AI, from a prompt that defined the header and an example row, and specified:
-  "Rows are a combine of all the printable 8-bit codepoints in the range 0x80-0xff defined in ISO-8859-1, -2, -9, -15, and -16. The rows are ordered by Unicode number. If a particular ISO standard does not include the character, leave that cell blank."
+  "Rows are a combination of all the printable 8-bit codepoints in the range 0x80-0xff defined in ISO-8859-1, -2, -9, -15, and -16. The rows are ordered by Unicode number. If a particular ISO standard does not include the character, leave that cell blank."
+Subsequently, after Excel import, the TDM and Comments columns were filled in by Geep, and color coding added. Language mappings were independently color coded, then verified against existing language maps. Finally converted to wikitable format.
-Subsequently, the TDM and Comments columns were filled in by Geep, and color coding added.
-Language mappings were independently color coded, then verified against existing language maps.
 This table skips codepoints in the range 0x00-0x7f, because they are the same for all ISO-8859 standards. No mapping required.
@@ Line 298: / Line 295: @@
 |-
-| U+00D0 || Ð || 0xD0 || 0xD0 || || style="color:#ff0000; font-weight:bold;" | n/a || 0xD0 || || latin capital letter eth || Same glyph as U+0110 latin capital letter d with stroke
+| U+00D0 || Ð || 0xD0 || 0xD0 || || style="background:#43a8cc;" | n/a || 0xD0 || || latin capital letter eth || Same glyph as U+0110 latin capital letter d with stroke
 |-
@@ Line 376: / Line 373: @@
 |-
-| U+00EA || ê || 0xEA || 0xEA || style="background:#ffe4b5;" | 0xBD || 0xEA || 0xEA || 0xEA || latin small letter e with circumflex || Hack for iso 2: Put ´ (acute accent, 0xBD) to get ê for "Português"
+| U+00EA || ê || 0xEA || 0xEA || style="background:#ebbb54;color:#ff0000;" | 0xBD || 0xEA || 0xEA || 0xEA || latin small letter e with circumflex || Hack for iso 2: Put ´ (acute accent, 0xBD) to get ê for "Português"
 |-
@@ Line 397: / Line 394: @@
 |-
-| U+00F1 || ñ || 0xF1 || 0xF1 || style="background:#ffe4b5;" | 0xA8 || 0xF1 || 0xF1 || style="background:#ffe4b5;" | 0xB6 || latin small letter n with tilde || Hacks to get ñ for "Español": for iso 2, put ¨ (diaeresis 0xA8); for iso 16, put ¶ (0xB6)
+| U+00F1 || ñ || 0xF1 || 0xF1 || style="background:#ebbb54;color:#ff0000;" | 0xA8 || 0xF1 || 0xF1 || style="background:#ebbb54;color:#ff0000;" | 0xB6 || latin small letter n with tilde || Hacks to get ñ for "Español": for iso 2, put ¨ (diaeresis 0xA8); for iso 16, put ¶ (0xB6)
 |-
@@ Line 442: / Line 439: @@
 |-
-| U+0102 || Ă || 0x8B || || style="background:#ffe4b5;" | 0xC3 || || || style="background:#ffe4b5;" | 0xC3 || latin capital letter a with breve ||
+| U+0102 || Ă || 0x8B || || style="background:#ebbb54;" | 0xC3 || || || style="background:#ebbb54;" | 0xC3 || latin capital letter a with breve ||
 |-
-| U+0103 || ă || 0x9B || || style="background:#ffe4b5;" | 0xE3 || || || style="background:#ffe4b5;" | 0xE3 || latin small letter a with breve ||
+| U+0103 || ă || 0x9B || || style="background:#ebbb54;" | 0xE3 || || || style="background:#ebbb54;" | 0xE3 || latin small letter a with breve ||
 |-
-| U+0104 || Ą || 0xAA || style="color:#7a7a7a; font-style:italic;" | ª || 0xA1 || style="color:#7a7a7a; font-style:italic;" | ª || style="color:#7a7a7a; font-style:italic;" | ª || 0xA1 || latin capital letter a with ogonek ||
+| U+0104 || Ą || 0xAA || style="background:#9ab4e3" | ª || style="background:#ebbb54;" | 0xA1 || style="background:#9ab4e3" | ª || style="background:#9ab4e3" | ª || style="background:#ebbb54;" | 0xA1 || latin capital letter a with ogonek ||
 |-
-| U+0105 || ą || 0xBA || style="color:#7a7a7a; font-style:italic;" | º || 0xB1 || style="color:#7a7a7a; font-style:italic;" | º || style="color:#7a7a7a; font-style:italic;" | º || 0xA2 || latin small letter a with ogonek ||
+| U+0105 || ą || 0xBA || style="background:#9ab4e3" | º || style="background:#ebbb54;" | 0xB1 || style="background:#9ab4e3" | º || style="background:#9ab4e3" | º || style="background:#ebbb54;" | 0xA2 || latin small letter a with ogonek ||
 |-
-| U+0106 || Ć || 0x82 || || 0xC6 || || || 0xC5 || latin capital letter c with acute ||
+| U+0106 || Ć || 0x82 || || style="background:#ebbb54;" | 0xC6 || || || style="background:#ebbb54;" | 0xC5 || latin capital letter c with acute ||
 |-
-| U+0107 || ć || 0x92 || || 0xE6 || || || 0xE5 || latin small letter c with acute ||
+| U+0107 || ć || 0x92 || || style="background:#ebbb54;" | 0xE6 || || || style="background:#ebbb54;" | 0xE5 || latin small letter c with acute ||
-|-
+|- style="background:#dbab8a;"
 | U+0108 || Ĉ || 0x86 || || || || || || latin capital c with circumflex || Only in ISO-8859-3 (for Esperanto) at 0xC6
-|-
+|- style="background:#dbab8a;"
 | U+0109 || ĉ || 0x96 || || || || || || latin small c with circumflex || Only in ISO-8859-3 (for Esperanto) at 0xE6
 |-
-| U+010C || Č || 0xAC || style="color:#7a7a7a; font-style:italic;" | ¬ || 0xC8 || style="color:#7a7a7a; font-style:italic;" | ¬ || style="color:#7a7a7a; font-style:italic;" | ¬ || 0xB2 || latin capital letter c with caron || Substitute char "not sign" at 0xAC
+| U+010C || Č || 0xAC || style="background:#9ab4e3" | ¬ || style="background:#ebbb54;" | 0xC8 || style="background:#9ab4e3" | ¬ || style="background:#9ab4e3" | ¬ || style="background:#ebbb54;" | 0xB2 || latin capital letter c with caron || Substitute char "not sign" at 0xAC
 |-
-| U+010D || č || 0xAE || style="color:#7a7a7a; font-style:italic;" | ® || 0xE8 || style="color:#7a7a7a; font-style:italic;" | ® || style="color:#7a7a7a; font-style:italic;" | ® || 0xB9 || latin small letter c with caron || Substitute char "registered sign" at 0xAE
+| U+010D || č || 0xAE || style="background:#9ab4e3" | ® || style="background:#ebbb54;" | 0xE8 || style="background:#9ab4e3" | ® || style="background:#9ab4e3" | ® || style="background:#ebbb54;" | 0xB9 || latin small letter c with caron || Substitute char "registered sign" at 0xAE
 |-
-| U+010E || Ď || 0xB3 || style="color:#7a7a7a; font-style:italic;" | ³ || 0xCF || style="color:#7a7a7a; font-style:italic;" | ³ || style="color:#7a7a7a; font-style:italic;" | ³ || || latin capital letter d with caron || Substitute char "superscript three" at 0xB3
+| U+010E || Ď || 0xB3 || style="background:#9ab4e3" | ³ || style="background:#ebbb54;" | 0xCF || style="background:#9ab4e3" | ³ || style="background:#9ab4e3" | ³ || || latin capital letter d with caron || Substitute char "superscript three" at 0xB3
 |-
-| U+010F || ď || 0xB7 || style="color:#7a7a7a; font-style:italic;" | · || 0xEF || style="color:#7a7a7a; font-style:italic;" | · || style="color:#7a7a7a; font-style:italic;" | · || || latin small letter d with caron || Substitute char "middle dot" at 0xB7
+| U+010F || ď || 0xB7 || style="background:#9ab4e3; font-weight:bold" | · || style="background:#ebbb54;" | 0xEF || style="background:#9ab4e3; font-weight:bold" | · || style="background:#9ab4e3; font-weight:bold" | · || || latin small letter d with caron || Substitute char "middle dot" at 0xB7
 |-
-| U+0110 || Đ || 0xD0 || 0xD0 || 0xD0 || style="color:#ff0000; font-weight:bold;" | n/a || || 0xD0 || latin capital letter d with stroke || Same glyph as U+00D0 latin capital letter eth
+| U+0110 || Đ || 0xD0 || 0xD0 || 0xD0 || style="background:#43a8cc;" | n/a || || 0xD0 || latin capital letter d with stroke || Same glyph as U+00D0 latin capital letter eth
 |-
-| U+0111 || đ || 0x90 || || 0xF0 || style="color:#ff0000; font-weight:bold;" | n/a || || 0xF0 || latin small letter d with stroke ||
+| U+0111 || đ || 0x90 || || style="background:#ebbb54;" | 0xF0 || style="background:#43a8cc;" | n/a || || style="background:#ebbb54;" | 0xF0 || latin small letter d with stroke ||
 |-
-| U+0118 || Ę || 0xAB || style="color:#7a7a7a; font-style:italic;" | « || style="background:#ffe4b5;" | 0xCA || style="color:#7a7a7a; font-style:italic;" | « || style="color:#7a7a7a; font-style:italic;" | « || 0xDD || latin capital letter e with ogonek || Substitute char "left-pointing double angle quotation mark" at 0xAB
+| U+0118 || Ę || 0xAB || style="background:#9ab4e3" | « || style="background:#ebbb54;" | 0xCA || style="background:#9ab4e3" | « || style="background:#9ab4e3" | « || style="background:#ebbb54;" | 0xDD || latin capital letter e with ogonek || Substitute char "left-pointing double angle quotation mark" at 0xAB
 |-
-| U+0119 || ę || 0xBB || style="color:#7a7a7a; font-style:italic;" | » || style="background:#ffe4b5;" | 0xEA || style="color:#7a7a7a; font-style:italic;" | » || style="color:#7a7a7a; font-style:italic;" | » || 0xFD || latin small letter e with ogonek || Substitute char "right-pointing double angle quotation mark" at 0xBB
+| U+0119 || ę || 0xBB || style="background:#9ab4e3" | » || style="background:#ebbb54;" | 0xEA || style="background:#9ab4e3" | » || style="background:#9ab4e3" | » || style="background:#ebbb54;" | 0xFD || latin small letter e with ogonek || Substitute char "right-pointing double angle quotation mark" at 0xBB
 |-
-| U+011A || Ě || 0xA5 || style="color:#7a7a7a; font-style:italic;" | ¥ || 0xCC || style="color:#7a7a7a; font-style:italic;" | ¥ || style="color:#7a7a7a; font-style:italic;" | ¥ || || latin capital letter e with caron || Substitute char "yen sign" at 0xA5
+| U+011A || Ě || 0xA5 || style="background:#9ab4e3" | ¥ || style="background:#ebbb54;" | 0xCC || style="background:#9ab4e3" | ¥ || style="background:#9ab4e3" | ¥ || || latin capital letter e with caron || Substitute char "yen sign" at 0xA5
 |-
-| U+011B || ě || 0xA3 || style="color:#7a7a7a; font-style:italic;" | £ || 0xEC || style="color:#7a7a7a; font-style:italic;" | £ || style="color:#7a7a7a; font-style:italic;" | £ || || latin small letter e with caron || Substitute char "pound sign" at 0xA3
+| U+011B || ě || 0xA3 || style="background:#9ab4e3" | £ || style="background:#ebbb54;" | 0xEC || style="background:#9ab4e3" | £ || style="background:#9ab4e3" | £ || || latin small letter e with caron || Substitute char "pound sign" at 0xA3
 |-
-| U+011E || Ğ || 0x88 || || || 0xD0 || || || latin capital letter g with breve || As of TDM 2.13
+| U+011E || Ğ || 0x88 || || || style="background:#ebbb54;" | 0xD0 || || || latin capital letter g with breve || As of TDM 2.13 (TDM codemap), 2.15 (turkish.map)
 |-
-| U+011F || ğ || 0x98 || || || 0xF0 || || || latin small letter g with breve || As of TDM 2.13
+| U+011F || ğ || 0x98 || || || style="background:#ebbb54;" | 0xF0 || || || latin small letter g with breve || As of TDM 2.13 (TDM codemap), 2.15 (turkish.map)
 |- style="color:#8c8c8c;"
-| U+0130 || İ || || || || style="background:#ffe4b5;" | 0xDD || || || latin capital letter i with dot above || Turkish: utf8 "İ" will be mapped to "Î" (0xCE)
+| U+0130 || İ || || || || style="background:#ebbb54;" | 0xDD || || || latin capital letter i with dot above || Turkish: utf8 "İ" will be mapped to "Î" (0xCE)
 |- style="color:#8c8c8c;"
-| U+0131 || ı || || || || style="background:#ffe4b5;" | 0xFD || || || latin small letter dotless i || Turkish: utf8 "ı" will be mapped to ASCII "i" (0x69)
+| U+0131 || ı || || || || style="background:#ebbb54;" | 0xFD || || || latin small letter dotless i || Turkish: utf8 "ı" will be mapped to ASCII "i" (0x69)
 |- style="color:#8c8c8c;"
@@ Line 520: / Line 517: @@
 |-
-| U+0141 || Ł || 0xB1 || style="color:#7a7a7a; font-style:italic;" | ± || 0xA3 || style="color:#7a7a7a; font-style:italic;" | ± || style="color:#7a7a7a; font-style:italic;" | ± || 0xA3 || latin capital letter l with stroke || Substitute char "plus-minus sign" at 0xB1
+| U+0141 || Ł || 0xB1 || style="background:#9ab4e3" | ± || style="background:#ebbb54;" | 0xA3 || style="background:#9ab4e3" | ± || style="background:#9ab4e3" | ± || style="background:#ebbb54;" | 0xA3 || latin capital letter l with stroke || Substitute char "plus-minus sign" at 0xB1
 |-
-| U+0142 || ł || 0xB5 || style="color:#7a7a7a; font-style:italic;" | µ || 0xB3 || style="color:#7a7a7a; font-style:italic;" | µ || style="color:#7a7a7a; font-style:italic;" | µ || 0xB3 || latin small letter l with stroke || Substitute char "micro sign" at 0xB5
+| U+0142 || ł || 0xB5 || style="background:#9ab4e3" | µ || style="background:#ebbb54;" | 0xB3 || style="background:#9ab4e3" | µ || style="background:#9ab4e3" | µ || style="background:#ebbb54;" | 0xB3 || latin small letter l with stroke || Substitute char "micro sign" at 0xB5
 |-
-| U+0143 || Ń || 0x8C || || 0xD1 || || || 0xD1 || latin capital letter n with acute ||
+| U+0143 || Ń || 0x8C || || style="background:#ebbb54;" | 0xD1 || || || style="background:#ebbb54;" | 0xD1 || latin capital letter n with acute ||
 |-
-| U+0144 || ń || 0x9C || || 0xF1 || || || 0xF1 || latin small letter n with acute ||
+| U+0144 || ń || 0x9C || || style="background:#ebbb54;" | 0xF1 || || || style="background:#ebbb54;" | 0xF1 || latin small letter n with acute ||
 |-
-| U+0147 || Ň || 0x80 || || 0xD2 || || || || latin capital letter n with caron ||
+| U+0147 || Ň || 0x80 || || style="background:#ebbb54;" | 0xD2 || || || || latin capital letter n with caron ||
 |-
-| U+0148 || ň || 0xA1 || style="color:#7a7a7a; font-style:italic;" | ¡ || 0xF2 || style="color:#7a7a7a; font-style:italic;" | ¡ || style="color:#7a7a7a; font-style:italic;" | ¡ || || latin small letter n with caron || Substitute char "inverted exclamation mark" at 0xA1
+| U+0148 || ň || 0xA1 || style="background:#9ab4e3" | ¡ || style="background:#ebbb54;" | 0xF2 || style="background:#9ab4e3" | ¡ || style="background:#9ab4e3" | ¡ || || latin small letter n with caron || Substitute char "inverted exclamation mark" at 0xA1
 |-
-| U+0150 || Ő || 0xB0 || style="color:#7a7a7a; font-style:italic;" | ° || style="background:#ffe4b5;" | 0xD5 || style="color:#7a7a7a; font-style:italic;" | ° || style="color:#7a7a7a; font-style:italic;" | ° || style="background:#ffe4b5;" | 0xD5 || latin capital letter o with double acute || Similar to Ö, used in Hungarian. Substitute char "degree sign" at 0xB0
+| U+0150 || Ő || 0xB0 || style="background:#9ab4e3" | ° || style="background:#ebbb54;" | 0xD5 || style="background:#9ab4e3" | ° || style="background:#9ab4e3" | ° || style="background:#ebbb54;" | 0xD5 || latin capital letter o with double acute || Similar to Ö, used in Hungarian. Substitute char "degree sign" at 0xB0
 |-
-| U+0151 || ő || 0xB9 || style="color:#7a7a7a; font-style:italic;" | ¹ || style="background:#ffe4b5;" | 0xF5 || style="color:#7a7a7a; font-style:italic;" | ¹ || style="color:#7a7a7a; font-style:italic;" | ¹ || style="background:#ffe4b5;" | 0xF5 || latin small letter o with double acute || Similar to ö, used in Hungarian. Substitute char "superscript one" at 0xB9
+| U+0151 || ő || 0xB9 || style="background:#9ab4e3" | ¹ || style="background:#ebbb54;" | 0xF5 || style="background:#9ab4e3" | ¹ || style="background:#9ab4e3" | ¹ || style="background:#ebbb54;" | 0xF5 || latin small letter o with double acute || Similar to ö, used in Hungarian. Substitute char "superscript one" at 0xB9
 |-
-| U+0152 || Œ || 0xBC || style="color:#7a7a7a; font-style:italic;" | ¼ || || style="color:#7a7a7a; font-style:italic;" | ¼ || 0xBC || 0xBC || latin capital ligature oe || Substitute char "vulgar fraction one quarter" at 0xBC
+| U+0152 || Œ || 0xBC || style="background:#9ab4e3" | ¼ || || style="background:#9ab4e3" | ¼ || 0xBC || 0xBC || latin capital ligature oe || Substitute char "vulgar fraction one quarter" at 0xBC
 |-
-| U+0153 || œ || 0xBD || style="color:#7a7a7a; font-style:italic;" | ½ || || style="color:#7a7a7a; font-style:italic;" | ½ || 0xBD || 0xBD || latin small ligature oe || Substitute char "vulgar fraction one half" at 0xBD
+| U+0153 || œ || 0xBD || style="background:#9ab4e3" | ½ || || style="background:#9ab4e3" | ½ || 0xBD || 0xBD || latin small ligature oe || Substitute char "vulgar fraction one half" at 0xBD
 |-
-| U+0154 || Ŕ || 0x89 || || 0xC0 || || || || latin capital letter r with acute ||
+| U+0154 || Ŕ || 0x89 || || style="background:#ebbb54;" | 0xC0 || || || || latin capital letter r with acute ||
 |-
-| U+0155 || ŕ || 0x99 || || 0xE0 || || || || latin small letter r with acute ||
+| U+0155 || ŕ || 0x99 || || style="background:#ebbb54;" | 0xE0 || || || || latin small letter r with acute ||
 |-
-| U+0158 || Ř || 0xD7 || style="color:#7a7a7a; font-style:italic;" | × || style="background:#ffe4b5;" | 0xD8 || style="color:#7a7a7a; font-style:italic;" | × || style="color:#7a7a7a; font-style:italic;" | × || || latin capital letter r with caron || Substitute char "multiplication sign" at 0xD7
+| U+0158 || Ř || 0xD7 || style="background:#9ab4e3" | × || style="background:#ebbb54;" | 0xD8 || style="background:#9ab4e3" | × || style="background:#9ab4e3" | × || || latin capital letter r with caron || Substitute char "multiplication sign" at 0xD7
 |-
-| U+0159 || ř || 0xF7 || style="color:#7a7a7a; font-style:italic;" | ÷ || style="background:#ffe4b5;" | 0xF8 || style="color:#7a7a7a; font-style:italic;" | ÷ || style="color:#7a7a7a; font-style:italic;" | ÷ || || latin small letter r with caron || Substitute char "division sign" at 0xF7
+| U+0159 || ř || 0xF7 || style="background:#9ab4e3" | ÷ || style="background:#ebbb54;" | 0xF8 || style="background:#9ab4e3" | ÷ || style="background:#9ab4e3" | ÷ || || latin small letter r with caron || Substitute char "division sign" at 0xF7
 |-
-| U+015A || Ś || 0x81 || || 0xA6 || || || style="background:#ffe4b5;" | 0xD7 || latin capital letter s with acute ||
+| U+015A || Ś || 0x81 || || style="background:#ebbb54;" | 0xA6 || || || style="background:#ebbb54;" | 0xD7 || latin capital letter s with acute ||
 |-
-| U+015B || ś || 0x91 || || 0xB6 || || || style="background:#ffe4b5;" | 0xF7 || latin small letter s with acute ||
+| U+015B || ś || 0x91 || || style="background:#ebbb54;" | 0xB6 || || || style="background:#ebbb54;" | 0xF7 || latin small letter s with acute ||
-|-
+|- style="background:#dbab8a;"
 | U+015C || Ŝ || 0x85 || || || || || || latin capital letter s with circumflex || Only in ISO-8859-3 (for Esperanto) at 0xDE
-|-
+|- style="background:#dbab8a;"
 | U+015D || ŝ || 0x95 || || || || || || latin small letter s with circumflex || Only in ISO-8859-3 (for Esperanto) at 0xFE
 |-
-| U+015E || Ş || 0x8D || || 0xAA || style="background:#ffe4b5;" | 0xDE || || || latin capital letter s with cedilla || Can stand in for "...comma under"
+| U+015E || Ş || 0x8D || || style="background:#ebbb54;" | 0xAA || style="background:#ebbb54;" | 0xDE || || || latin capital letter s with cedilla || Can stand in for "...comma under"
 |-
-| U+015F || ş || 0x9D || || 0xBA || style="background:#ffe4b5;" | 0xFE || || || latin small letter s with cedilla || Can stand in for "...comma under"
+| U+015F || ş || 0x9D || || style="background:#ebbb54;" | 0xBA || style="background:#ebbb54;" | 0xFE || || || latin small letter s with cedilla || Can stand in for "...comma under"
 |-
-| U+0160 || Š || 0xA6 || style="color:#7a7a7a; font-style:italic;" | ¦ || 0xA9 || style="color:#7a7a7a; font-style:italic;" | ¦ || 0xA6 || 0xA6 || latin capital letter s with caron || Substitute char "broken bar" at 0xA6
+| U+0160 || Š || 0xA6 || style="background:#9ab4e3" | ¦ || style="background:#ebbb54;" | 0xA9 || style="background:#9ab4e3" | ¦ || 0xA6 || 0xA6 || latin capital letter s with caron || Substitute char "broken bar" at 0xA6
 |-
-| U+0161 || š || 0xA8 || style="color:#7a7a7a; font-style:italic;" | ¨ || 0xB9 || style="color:#7a7a7a; font-style:italic;" | ¨ || 0xA8 || 0xA8 || latin small letter s with caron || Substitute char "diaeresis" at 0xA8
+| U+0161 || š || 0xA8 || style="background:#9ab4e3" | ¨ || style="background:#ebbb54;" | 0xB9 || style="background:#9ab4e3" | ¨ || 0xA8 || 0xA8 || latin small letter s with caron || Substitute char "diaeresis" at 0xA8
 |-
-| U+0162 || Ţ || 0x8E || || style="background:#ffe4b5;" | 0xDE || || || || latin capital letter t with cedilla || Can stand in for "...comma under"
+| U+0162 || Ţ || 0x8E || || style="background:#ebbb54;" | 0xDE || || || || latin capital letter t with cedilla || Can stand in for "...comma under"
 |-
-| U+0163 || ţ || 0x9E || || style="background:#ffe4b5;" | 0xFE || || || || latin small letter t with cedilla || Can stand in for "...comma under"
+| U+0163 || ţ || 0x9E || || style="background:#ebbb54;" | 0xFE || || || || latin small letter t with cedilla || Can stand in for "...comma under"
 |-
-| U+0164 || Ť || 0xB2 || style="color:#7a7a7a; font-style:italic;" | ² || 0xAB || style="color:#7a7a7a; font-style:italic;" | ² || style="color:#7a7a7a; font-style:italic;" | ² || || latin capital letter t with caron || Substitute char "superscript two" at 0xB2
+| U+0164 || Ť || 0xB2 || style="background:#9ab4e3" | ² || style="background:#ebbb54;" | 0xAB || style="background:#9ab4e3" | ² || style="background:#9ab4e3" | ² || || latin capital letter t with caron || Substitute char "superscript two" at 0xB2
 |-
-| U+0165 || ť || 0xB6 || style="color:#7a7a7a; font-style:italic;" | ¶ || 0xBB || style="color:#7a7a7a; font-style:italic;" | ¶ || style="color:#7a7a7a; font-style:italic;" | ¶ || || latin small letter t with caron || Substitute char "pilcrow sign" at 0xB6
+| U+0165 || ť || 0xB6 || style="background:#9ab4e3" | ¶ || style="background:#ebbb54;" | 0xBB || style="background:#9ab4e3" | ¶ || style="background:#9ab4e3" | ¶ || || latin small letter t with caron || Substitute char "pilcrow sign" at 0xB6
 |-
-| U+016E || Ů || 0xA9 || style="color:#7a7a7a; font-style:italic;" | © || 0xD9 || style="color:#7a7a7a; font-style:italic;" | © || style="color:#7a7a7a; font-style:italic;" | © || || latin capital letter u with ring above || Substitute char "copyright sign" at 0xA9
+| U+016E || Ů || 0xA9 || style="background:#9ab4e3" | © || style="background:#ebbb54;" | 0xD9 || style="background:#9ab4e3" | © || style="background:#9ab4e3" | © || || latin capital letter u with ring above || Substitute char "copyright sign" at 0xA9
 |-
-| U+016F || ů || 0xAF || style="color:#7a7a7a; font-style:italic;" | ¯ || 0xF9 || style="color:#7a7a7a; font-style:italic;" | ¯ || style="color:#7a7a7a; font-style:italic;" | ¯ || || latin small letter u with ring above || Substitute char "macron" at 0xAF
+| U+016F || ů || 0xAF || style="background:#9ab4e3" | ¯ || style="background:#ebbb54;" | 0xF9 || style="background:#9ab4e3" | ¯ || style="background:#9ab4e3" | ¯ || || latin small letter u with ring above || Substitute char "macron" at 0xAF
 |-
-| U+0170 || Ű || 0xA2 || style="color:#7a7a7a; font-style:italic;" | ¢ || 0xDB || style="color:#7a7a7a; font-style:italic;" | ¢ || style="color:#7a7a7a; font-style:italic;" | ¢ || style="background:#ffe4b5;" | 0xD8 || latin capital letter u with double acute || Similar to Ü, used in Hungarian. Substitute char "cent sign" at 0xA2
+| U+0170 || Ű || 0xA2 || style="background:#9ab4e3" | ¢ || style="background:#ebbb54;" | 0xDB || style="background:#9ab4e3" | ¢ || style="background:#9ab4e3" | ¢ || style="background:#ebbb54;" | 0xD8 || latin capital letter u with double acute || Similar to Ü, used in Hungarian. Substitute char "cent sign" at 0xA2
 |-
-| U+0171 || ű || 0xA4 || style="color:#7a7a7a; font-style:italic;" | ¤ || 0xFB || style="color:#7a7a7a; font-style:italic;" | ¤ || style="color:#7a7a7a; font-style:italic;" | ¤ || style="background:#ffe4b5;" | 0xF8 || latin small letter u with double acute || Similar to ü, used in Hungarian. Substitute char "currency sign" at 0xA4
+| U+0171 || ű || 0xA4 || style="background:#9ab4e3" | ¤ || style="background:#ebbb54;" | 0xFB || style="background:#9ab4e3" | ¤ || style="background:#9ab4e3" | ¤ || style="background:#ebbb54;" | 0xF8 || latin small letter u with double acute || Similar to ü, used in Hungarian. Substitute char "currency sign" at 0xA4
 |-
-| U+0178 || Ÿ || 0xBE || style="color:#7a7a7a; font-style:italic;" | ¾ || || style="color:#7a7a7a; font-style:italic;" | ¾ || 0xBE || 0xBE || latin capital letter y with diaeresis || Substitute char "vulgar fraction three quarters" at 0xBE
+| U+0178 || Ÿ || 0xBE || style="background:#9ab4e3" | ¾ || || style="background:#9ab4e3" | ¾ || 0xBE || 0xBE || latin capital letter y with diaeresis || Substitute char "vulgar fraction three quarters" at 0xBE
 |-
-| U+0179 || Ź || 0x84 || || 0xAC || || || style="color:#7a7a7a; font-style:italic;" | 0xAC || latin capital letter z with acute ||
+| U+0179 || Ź || 0x84 || || style="background:#ebbb54;" | 0xAC || || || style="background:#ebbb54;" | 0xAC || latin capital letter z with acute ||
 |-
-| U+017A || ź || 0x94 || || 0xBC || || || style="background:#ffe4b5;" | 0xAE || latin small letter z with acute ||
+| U+017A || ź || 0x94 || || style="background:#ebbb54;" | 0xBC || || || style="background:#ebbb54;" | 0xAE || latin small letter z with acute ||
 |-
-| U+017B || Ż || 0x83 || || 0xAF || || || style="background:#ffe4b5;" | 0xAF || latin capital letter z with dot above ||
+| U+017B || Ż || 0x83 || || style="background:#ebbb54;" | 0xAF || || || style="background:#ebbb54;" | 0xAF || latin capital letter z with dot above ||
 |-
-| U+017C || ż || 0x93 || || 0xBF || || || style="background:#ffe4b5;" | 0xBF || latin small letter z with dot above ||
+| U+017C || ż || 0x93 || || style="background:#ebbb54;" | 0xBF || || || style="background:#ebbb54;" | 0xBF || latin small letter z with dot above ||
 |-
-| U+017D || Ž || 0xB4 || style="color:#7a7a7a; font-style:italic;" | ´ || 0xAE || style="color:#7a7a7a; font-style:italic;" | ´ || 0xB4 || 0xB4 || latin capital letter z with caron || Substitute char "acute accent" at 0xB4
+| U+017D || Ž || 0xB4 || style="background:#9ab4e3" | ´ || style="background:#ebbb54;" | 0xAE || style="background:#9ab4e3" | ´ || 0xB4 || 0xB4 || latin capital letter z with caron || Substitute char "acute accent" at 0xB4
 |-
-| U+017E || ž || 0xB8 || style="color:#7a7a7a; font-style:italic;" | ¸ || 0xBE || style="color:#7a7a7a; font-style:italic;" | ¸ || 0xB8 || 0xB8 || latin small letter z with caron || Substitute char "cedilla" at 0xB8
+| U+017E || ž || 0xB8 || style="background:#9ab4e3" | ¸ || style="background:#ebbb54;" | 0xBE || style="background:#9ab4e3" | ¸ || 0xB8 || 0xB8 || latin small letter z with caron || Substitute char "cedilla" at 0xB8
-|-
+|- style="background:#dbab8a;"
 | U+01D3 || Ǔ || 0x8A || || || || || || latin capital letter u with caron || Not found in ISO-8859. Pinyin tone marking
-|-
+|- style="background:#dbab8a;"
 | U+01D4 || ǔ || 0x9A || || || || || || latin small letter u with caron || Not found in ISO-8859. Pinyin tone marking
 |-
-| U+0218 || Ș || 0x8D || || || || || style="background:#ffe4b5;" | 0xAA || latin capital letter s with comma below || See also "...with cedilla"
+| U+0218 || Ș || 0x8D || || || || || style="background:#ebbb54;" | 0xAA || latin capital letter s with comma below || See also "...with cedilla"
 |-
-| U+0219 || ș || 0x9D || || || || || style="background:#ffe4b5;" | 0xBA || latin small letter s with comma below || See also "...with cedilla"
+| U+0219 || ș || 0x9D || || || || || style="background:#ebbb54;" | 0xBA || latin small letter s with comma below || See also "...with cedilla"
 |-
-| U+021A || Ț || 0x8E || || || || || style="background:#ffe4b5;" | 0xDE || latin capital letter t with comma below || See also "...with cedilla"
+| U+021A || Ț || 0x8E || || || || || style="background:#ebbb54;" | 0xDE || latin capital letter t with comma below || See also "...with cedilla"
 |-
-| U+021B || ț || 0x9E || || || || || style="background:#ffe4b5;" | 0xFE || latin small letter t with comma below || See also "...with cedilla"
+| U+021B || ț || 0x9E || || || || || style="background:#ebbb54;" | 0xFE || latin small letter t with comma below || See also "...with cedilla"
 |- style="color:#8c8c8c;"
@@ Line 655: / Line 652: @@
 |- style="color:#8c8c8c;"
-| U+02D9 || ˙ || || || || style="background:#ffe4b5;" | 0xFF || || || dot above ||
+| U+02D9 || ˙ || || || || 0xFF || || || dot above ||
 |- style="color:#8c8c8c;"
@@ Line 661: / Line 658: @@
 |- style="color:#8c8c8c;"
-| U+02DD || ˝ || || || || style="background:#ffe4b5;" | 0xBD || || || double acute accent ||
+| U+02DD || ˝ || || || || 0xBD || || || double acute accent ||
-|-
+|- style="background:#dbab8a;"
 | U+1E90 || Ẑ || 0x87 || || || || || || latin capital letter z with circumflex || Not found in ISO-8859. Rare use in Cyrillic-to-Latin transliteration, or Pinyin
-|-
+|- style="background:#dbab8a;"
 | U+1E91 || ẑ || 0x97 || || || || || || latin small letter z with circumflex || Not found in ISO-8859. Rare use in Cyrillic-to-Latin transliteration, or Pinyin
 |- style="color:#8c8c8c;"
-| U+201D || ” || || || || || || style="background:#ffe4b5;" | 0xB5 || right double quotation mark ||
+| U+201D || ” || || || || || || 0xB5 || right double quotation mark ||
 |- style="color:#8c8c8c;"
-| U+201E || „ || || || || || || style="background:#ffe4b5;" | 0xA5 || double low-9 quotation mark ||
+| U+201E || „ || || || || || || 0xA5 || double low-9 quotation mark ||
 |- style="color:#8c8c8c;"
 | U+20AC || € || || || || || 0xA4 || 0xA4 || euro sign ||
 |}
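The ISO-8859 byte values in the table can be spot-checked with Python's standard codecs. The sketch below is only an illustration, not part of the TDM toolchain: the <code>ROWS</code> list is a hand-copied subset of the table, the helper names are hypothetical, and the association of table columns with ISO-8859-2, -9, -15 and -16 is an assumption inferred from the byte values. It also shows why a substitute character appears: a reader that interprets a byte such as 0xBA as ISO-8859-1 sees "º" where a different character was intended.

```python
# Spot-check a few table rows against Python's built-in ISO-8859 codecs.
# ROWS is a hand-copied, illustrative subset of the table; the codec chosen
# for each entry is an assumption inferred from the byte values.
ROWS = [
    # (character, Python codec name, byte value listed in the table)
    ("ą", "iso8859_2", 0xB1),
    ("ą", "iso8859_16", 0xA2),
    ("Č", "iso8859_2", 0xC8),
    ("Ğ", "iso8859_9", 0xD0),
    ("Œ", "iso8859_15", 0xBC),
    ("Ș", "iso8859_16", 0xAA),
]


def check_rows(rows):
    """Return the rows whose encoded byte differs from the table value."""
    mismatches = []
    for char, codec, expected in rows:
        actual = char.encode(codec)[0]  # single-byte encodings: one byte per char
        if actual != expected:
            mismatches.append((char, codec, expected, actual))
    return mismatches


def substitute_char(byte_value):
    """Return the glyph an ISO-8859-1 reader would show for the given byte."""
    return bytes([byte_value]).decode("iso8859_1")


if __name__ == "__main__":
    print(check_rows(ROWS))       # an empty list means every row agrees
    print(substitute_char(0xBA))  # the substitute char shown in the U+0105 row
```

Extending <code>ROWS</code> with further table entries is a quick way to catch transcription slips when the table is edited.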
+== For More ==
+* [[I18N - Character mapping]] sketches the format and location of <language>.map files.
+* [[I18N - Charset]] is the main article about TDM language use and various encodings.