The Unicode Consortium Discussion Forum

 Post subject: Understanding East Asian Width
Posted: Thu Feb 11, 2010 9:55 am

In a recent query, Mark Crispin asked, in essence, why East Asian Width (EAW) was not defined differently from the way it is defined in UAX #11. He asked why EAW was not defined this way:
[1] Every Unicode code point has a constant "fixed-width" property of 0, 1, 2, or "not meaningful"; there is no fullwidth/halfwidth distinction that switches sense based on environment.
[2] Every Unicode code point that represents a code point in a legacy charset with fixed widths has the same "fixed-width" property as in that legacy character set.

Instead, UAX #11 assigns the value "ambiguous" to a large number of characters, whose display width depends on the context in which they are used.
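To see which characters end up "ambiguous", the property can be inspected directly. The following is a minimal sketch using Python's standard `unicodedata` module; the sample characters and the helper name `eaw_table` are chosen here for illustration and do not come from the original query.

```python
import unicodedata

# Illustrative sample: ASCII letter, fullwidth letter, ideograph,
# halfwidth katakana, Greek letter, Cyrillic letter, math symbol.
SAMPLES = ["A", "\uFF21", "\u4E2D", "\uFF71", "\u03B1", "\u0414", "\u00B1"]

def eaw_table(chars):
    """Return (code point, character name, East_Asian_Width value) per character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.east_asian_width(c))
        for c in chars
    ]

for code_point, name, width in eaw_table(SAMPLES):
    # Values are the UAX #11 short names: Na, N, W, F, H, A.
    print(f"{code_point}  {width:<2}  {name}")
```

In this sample, the Greek, Cyrillic, and mathematical characters report `A` (ambiguous), while the fullwidth and halfwidth compatibility characters report `F` and `H`.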

The reason for the design choice in UAX #11 is that many characters would otherwise have had to be duplicated. East Asian character sets contain large collections of mathematical symbols and Latin characters (both ASCII and accented), as well as Greek and Cyrillic characters.

In legacy environments these characters have a fixed width, but in modern use in the same countries many of them no longer do. Data interchange would also have been a nightmare if it had required mapping between fixed-width and variable-width clones of the same characters.

Therefore, instead of allowing the display width of characters to drive the encoding, Unicode limited duplication of characters to those cases where they had been duplicated within a legacy character set. The EAW property was then created to make available as much information about the display width of these characters in legacy environments as possible.
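As a rough illustration of how a program might use the property, the sketch below computes a column count for a string and resolves "ambiguous" characters from a caller-supplied context flag. The function name `display_width` and the parameter `east_asian_context` are hypothetical. Treating neutral characters as one column and ignoring combining marks and control characters are simplifying assumptions, not rules taken from the post.

```python
import unicodedata

def display_width(text: str, east_asian_context: bool) -> int:
    """Estimate the column width of text in a fixed-width environment.

    Assumption: ambiguous (A) characters take 2 columns in an East Asian
    legacy context and 1 column otherwise. Combining marks and control
    characters are not handled specially in this sketch.
    """
    total = 0
    for ch in text:
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("F", "W"):
            total += 2          # fullwidth and wide: always 2 columns
        elif eaw == "A":
            total += 2 if east_asian_context else 1  # resolved by context
        else:
            total += 1          # Na, H, and N treated as 1 column here
    return total

greek = "\u03B1\u03B2\u03B3"    # three Greek letters, all ambiguous
print(display_width(greek, east_asian_context=False))
print(display_width(greek, east_asian_context=True))
```

The same three Greek letters yield different totals depending on the flag, which is the context dependence that a single constant width per code point could not express without duplicating the characters.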
