Re: Unicode Normalization on MS-Windows

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Apr 28 2003 - 11:54:01 EDT

    Jane Liu <xjliu_ca at yahoo dot com> wrote:

    > Correct me if I'm wrong, it seems to me, not only for this case,
    > actually in general, neither Microsoft Windows nor those popular UNIX
    > systems (AIX, Solaris, HP-UX) currently supply the explicit support
    > of Unicode normalization at the encoding/conversion level. I suspect
    > this would also apply to all major databases. The bottom line would
    > be "WYSIWYG = What You See Is What You Get", Right?
    >
    > If that's true, can we conclude that in order to maintain the
    > transparency and round-trip safety between application and OS, the
    > application should not use normalization?

    At least in this case, yes. There may be a need to preserve the
    distinction between the two characters in question. There may not be
    any such need, but the operating system doesn't know that and needs to
    make the safest choice.

    Normalization is a double-edged sword. When two glyphs look identical,
    it may seem reasonable to expect them to have the same code point. (No,
    I'm not talking about Latin A and Greek Α and Cyrillic А here.) But
    when a document contains characters encoded with two different code
    points, it may be surprising to have them folded into one code point.
    Normalization, especially compatibility but also canonical, should be
    done carefully.
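
    To make the point concrete, here is a small sketch using Python's
    standard unicodedata module. The sample characters (ANGSTROM SIGN and
    the "fi" ligature) and the helper name fold() are illustrative choices
    of mine, not the characters from the original question. It shows that
    canonical normalization folds two distinct code points into one, that
    compatibility normalization folds even more, and that the look-alike
    letters from different scripts are left alone.

```python
import unicodedata


def fold(text, form):
    """Return text normalized to the given form ("NFC", "NFKC", ...)."""
    return unicodedata.normalize(form, text)


# Illustrative sample characters (assumptions, not from the original thread).
ANGSTROM_SIGN = "\u212B"   # ANGSTROM SIGN
A_WITH_RING = "\u00C5"     # LATIN CAPITAL LETTER A WITH RING ABOVE
FI_LIGATURE = "\uFB01"     # LATIN SMALL LIGATURE FI

# Canonical normalization: two different code points become one.
canonical_folded = fold(ANGSTROM_SIGN, "NFC") == fold(A_WITH_RING, "NFC")

# The fold is one-way: nothing in the result says which code point
# the document originally contained, so a round trip is not possible.
original_lost = fold(ANGSTROM_SIGN, "NFC") != ANGSTROM_SIGN

# Compatibility normalization is more aggressive: NFC keeps the ligature,
# NFKC replaces it with the two letters "f" and "i".
ligature_kept_by_nfc = fold(FI_LIGATURE, "NFC") == FI_LIGATURE
ligature_split_by_nfkc = fold(FI_LIGATURE, "NFKC") == "fi"

# Look-alikes from different scripts are never unified by normalization.
lookalikes = ["A", "\u0391", "\u0410"]  # Latin A, Greek Alpha, Cyrillic A
distinct_after_nfkc = len({fold(c, "NFKC") for c in lookalikes}) == 3

if __name__ == "__main__":
    print("canonical fold:", canonical_folded)
    print("original lost:", original_lost)
    print("ligature kept by NFC:", ligature_kept_by_nfc)
    print("ligature split by NFKC:", ligature_split_by_nfkc)
    print("look-alikes stay distinct:", distinct_after_nfkc)
```

    An application that must hand back exactly what it was given would
    store the original string and normalize only a copy used for
    comparison or searching.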

    > Also, it would be nice to give the application user the flexibility
    > to turn the normalization process on or off; however, this may
    > sound useless since the majority of those systems don't even
    > care.

    The majority of users don't care either.

    -Doug Ewell
     Fullerton, California
     http://users.adelphia.net/~dewell/


