Unicode Normalization on MS-Windows

From: Jane Liu (xjliu_ca@yahoo.com)
Date: Fri Apr 25 2003 - 13:55:46 EDT


    Dear Unicoders,

    I am using IBM ICU V1.8 for some testing on Windows 2000 and XP. I
    found that when I process some CJK characters, ICU normalizes them
    by default. For example, U+FA19 (神) is replaced by U+795E
    (神). However, if I save those two characters into a file on Windows
    2000 or XP using Notepad with "Unicode" selected as the encoding,
    Notepad does not appear to perform any such normalization or
    replacement. On the Windows file system, I can also use both
    characters in file and folder names, and the OS does not seem to
    normalize them either ...
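
    The replacement described above can be reproduced outside ICU. The
    following is only a sketch using Python's standard unicodedata
    module, not ICU itself; the names COMPAT, UNIFIED and
    normalized_forms are made up for illustration. It relies on U+FA19
    having a canonical singleton decomposition to U+795E, so every
    normalization form should map it to the unified ideograph.

```python
import unicodedata

COMPAT = "\uFA19"   # CJK COMPATIBILITY IDEOGRAPH-FA19
UNIFIED = "\u795E"  # CJK UNIFIED IDEOGRAPH-795E


def normalized_forms(text):
    """Return the text under each of the four Unicode normalization forms."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return {form: unicodedata.normalize(form, text) for form in forms}


if __name__ == "__main__":
    # Print the code point that each form produces for the compatibility ideograph.
    for form, result in normalized_forms(COMPAT).items():
        print(form, "U+%04X" % ord(result))
```

    Un-normalized, the two strings stay distinct, which matches what
    Notepad and the file system appear to preserve.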

    Can anyone please shed some light on:

    1. Why doesn't Windows do normalization, and is there any way to ask
    Windows to do it?

    2. If Windows never does normalization, how should I handle this in
    my Windows-based application, given that I am using ICU? I don't
    think simply turning off normalization in ICU would be a good idea.
    However, if I keep using ICU to normalize everything in my
    application, I may run into trouble when dealing with the Windows
    system ...
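
    One possible way to frame the trade-off in question 2, offered only
    as a sketch and not as what ICU or Windows does: keep the original
    string for anything handed to the OS, and normalize only when
    comparing. The helper name same_after_normalization and the file
    names are hypothetical, and Python's unicodedata stands in for ICU.

```python
import unicodedata


def same_after_normalization(a, b, form="NFC"):
    """Compare two strings by their normalized forms, leaving the originals untouched."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    name_on_disk = "\uFA19.txt"   # hypothetical file name stored un-normalized
    name_from_app = "\u795E.txt"  # hypothetical name after the application normalized it
    print(name_on_disk == name_from_app)
    print(same_after_normalization(name_on_disk, name_from_app))
```

    Because the stored name is never rewritten, a lookup by the exact
    original string would still work, while the normalized comparison
    detects that the two names differ only by normalization.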




    This archive was generated by hypermail 2.1.5 : Fri Apr 25 2003 - 14:40:22 EDT