From: Doug Ewell (email@example.com)
Date: Mon Apr 28 2003 - 11:54:01 EDT
Jane Liu <xjliu_ca at yahoo dot com> wrote:
> Correct me if I'm wrong, it seems to me, not only for this case,
> actually in general, neither Microsoft Windows nor those popular UNIX
> systems (AIX, Solaris, HP-UX) currently supply the explicit support
> of Unicode normalization at the encoding/conversion level. I suspect
> this would also apply to all major databases. The bottom line would
> be "WYSIWYG = What You See Is What You Get", Right?
> If that's true, can we conclude that in order to maintain the
> transparency and round-trip safety between application and OS, the
> application should not use normalization?
At least in this case, yes. There may be a need to preserve the
distinction between the two characters in question. There may not be
any such need, but the operating system doesn't know that and needs to
make the safest choice.
Normalization is a double-edged sword. When two glyphs look identical,
it may seem reasonable to expect them to have the same code point. (No,
I'm not talking about Latin A and Greek Α and Cyrillic А here.) But
when a document contains characters encoded with two different code
points, it may be surprising to have them folded into one code point.
Normalization, especially compatibility but also canonical, should be
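
To make the folding concrete, here is a minimal sketch using Python's
standard unicodedata module. The characters below are my own
illustrative picks (ANGSTROM SIGN, A WITH RING ABOVE, and the "fi"
ligature), not necessarily the pair discussed earlier in this thread,
and the helper name survives() is hypothetical.

```python
import unicodedata

# Illustrative characters (assumed for this sketch, not from the thread).
ANGSTROM_SIGN = "\u212B"  # ANGSTROM SIGN
A_WITH_RING = "\u00C5"    # LATIN CAPITAL LETTER A WITH RING ABOVE
FI_LIGATURE = "\uFB01"    # LATIN SMALL LIGATURE FI


def survives(form, text):
    """Return True if normalizing `text` to `form` leaves it unchanged."""
    return unicodedata.normalize(form, text) == text


# Canonical normalization folds two identical-looking code points into one,
# so the original distinction does not survive a round trip.
assert ANGSTROM_SIGN != A_WITH_RING
assert unicodedata.normalize("NFC", ANGSTROM_SIGN) == A_WITH_RING
assert not survives("NFC", ANGSTROM_SIGN)

# Compatibility normalization goes further: the ligature is untouched by
# NFC but is replaced by two plain letters under NFKC.
assert survives("NFC", FI_LIGATURE)
assert unicodedata.normalize("NFKC", FI_LIGATURE) == "fi"

# Latin A, Greek Alpha, and Cyrillic A are separate letters, not
# normalization variants, so every form leaves them distinct.
lookalikes = ["\u0041", "\u0391", "\u0410"]
assert all(survives("NFKC", ch) for ch in lookalikes)
```

An application that has to hand text back to the OS unchanged could run
a check like survives() first and skip normalization whenever it would
alter the string.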
> Also, it would be nice to give the application user the flexibility
> to turn the normalization process on or off; however, this may sound
> useless since the majority of those systems don't even care.
The majority of users don't care either.