In a recent query, Mark Crispin asked, in essence, why the East Asian Width (EAW) property was not defined differently from the way it is defined in
UAX #11. Specifically, he asked why EAW was not defined this way:
Quote:
[1] Every Unicode codepoint has a "fixed-width" property of 0, 1, 2, or "not meaningful", that was constant; no fullwidth/halfwidth that switches sense based on environment.
[2] Every Unicode codepoint that represents a codepoint in a legacy charset with fixed widths has the same "fixed-width" property as that legacy character set.
Instead, UAX #11 assigns a value of "ambiguous" to a large number of characters whose width depends on context.
The reason for the design choice in UAX #11 was that many characters would otherwise have had to be duplicated. East Asian character sets contain large collections of mathematical symbols and Latin characters (both ASCII and accented), as well as Greek and Cyrillic characters.
In legacy environments these have a fixed width, but in modern use in the same countries many of them no longer do. And, of course, data interchange would have been a nightmare if it had required mapping between fixed-width and variable-width clones of the same characters.
Therefore, instead of allowing the display width of characters to drive the encoding, Unicode limited duplication of characters to those cases where they had already been duplicated within a legacy character set.
The EAW property was then created to make available as much information as possible about the display width of these characters in legacy environments.
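To make the distinction concrete, here is a minimal sketch in Python of how a program might turn the EAW property into a column count. It relies on the standard library's unicodedata.east_asian_width(); the helper name column_width and its east_asian_context flag are hypothetical, invented for illustration. The sketch assumes that ambiguous characters are treated as wide in a legacy East Asian context and narrow elsewhere, that neutral characters are treated as narrow, and it ignores combining marks and control characters entirely.

```python
import unicodedata


def column_width(ch: str, east_asian_context: bool) -> int:
    """Hypothetical helper: resolve one character to 1 or 2 display columns.

    Assumption: "A" (ambiguous) resolves to wide only when the caller says
    the text is being displayed in a legacy East Asian context.
    Combining marks and control characters are not handled here.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):   # wide and fullwidth: always two columns
        return 2
    if eaw == "A":          # ambiguous: the environment decides
        return 2 if east_asian_context else 1
    return 1                # "Na", "H", and "N" treated as one column


# GREEK SMALL LETTER ALPHA exists in East Asian legacy sets, so it is ambiguous.
for context in (True, False):
    print("alpha, East Asian context =", context, "->",
          column_width("\u03b1", context))
```

The same code point yields a different width depending on the flag, which is exactly the information a single constant 0, 1, or 2 per code point could not carry without encoding the character twice.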