RE: MS Windows and Unicode 4.0 ?

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Mon Dec 01 2003 - 18:47:25 EST

    Philippe,

    Win2000 was released to manufacturing in 1999 and had been frozen about six
    months before that. If I remember correctly, Unicode 3.0 came out after the
    freeze date. Win2000 implemented surrogate support but disabled it in the
    registry. I think that was a bad decision. With all the last-minute bug
    fixes it would have been easy to flip the switch, but doing so would have
    changed all the Unicode testing, and everything would have had to be
    retested with surrogates enabled.

    Carl

    > -----Original Message-----
    >From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    >Sent: Monday, December 01, 2003 3:03 PM
    >To: Carl W. Brown
    >Cc: Unicode@Unicode.Org
    >Subject: RE: MS Windows and Unicode 4.0 ?
    >
    >Carl W. Brown wrote:
    >> Doug writes:
    >> > You might remember that I chided Microsoft for
    >> > its definition of "Unicode" in
    >> > Windows 2000 Help, where Unicode was described
    >> > as a "16-bit standard" that was "developed between
    >> > 1988 and 1991," implying that the work was
    >> > finished. Even at the time Windows 2000 was being
    >> > developed, there was quite a bit of room for
    >> > improvement in this definition.
    >>
    >> You are right; however, Unicode was officially still 16-bit when
    >> Win2000 was released to manufacturing. Though they knew about
    >> surrogates and new planes, it was not official and could have
    >> been changed.
    >
    >Oh God... Surrogates were standardized long before they started
    >being used in Unicode 3.1 for new codepoint assignments outside
    >the BMP...
    >
    >And Microsoft was already a full member of the UTC, and knew all
    >about the required support for GB18030 in P.R.China starting in
    >2000.
    >
    >Unicode 3.0.0 was released in September 1999
    >and superseded Unicode 2.1.9, published in April 1999
    >(UTR #8 version 3.0, see
    >http://www.unicode.org/unicode/reports/tr8/).
    >
    >Note also that normalization was already published at that time
    >(see version 17.0 of UTR#15 in September 1999 at
    >http://www.unicode.org/unicode/reports/tr15/tr15-17.html)
    >
    >As was the encoding model for surrogates
    >(see http://www.unicode.org/reports/tr17/tr17-2.html
    >dated 1998-10-14, which clearly states that the
    >range of codepoints is 0..10FFFF and already references
    >UTF-8 and UTF-16 as valid encoding forms for this range,
    >with up to 4 bytes in UTF-8, or 2 words in UTF-16).
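
The encoding arithmetic referred to above can be sketched in a few lines. This is only an illustration of the standard UTF-16 surrogate calculation and the UTF-8 length rule for the range 0..10FFFF; the function names to_utf16_units and utf8_length are made up for this sketch and do not come from the thread or from any Windows API.

```python
def _check_scalar(cp):
    # Surrogate code points D800..DFFF are not characters on their own.
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)


def to_utf16_units(cp):
    """Return the list of 16-bit code units that encode cp in UTF-16."""
    _check_scalar(cp)
    if cp < 0x10000:
        return [cp]                      # BMP: one 16-bit word
    v = cp - 0x10000                     # 20 bits left to distribute
    high = 0xD800 + (v >> 10)            # top 10 bits -> high surrogate
    low = 0xDC00 + (v & 0x3FF)           # bottom 10 bits -> low surrogate
    return [high, low]                   # outside the BMP: two words


def utf8_length(cp):
    """Return how many bytes cp occupies in UTF-8 (1 to 4)."""
    _check_scalar(cp)
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
print([hex(u) for u in to_utf16_units(0x1D11E)])  # ['0xd834', '0xdd1e']
print(utf8_length(0x1D11E))                       # 4
```
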
    >
    >The character model was already known, as was the general
    >structure of Unicode for handling characters outside the BMP.
    >These new characters were not standardized magically from
    >nothing: the Han working group was actively working and the
    >GB18030 standard was already there, which clearly demonstrated
    >that mapping the required GB18030 repertoire into Unicode
    >would be unavoidable. So there were already very active
    >discussions between Unicode, ISO/IEC 10646, and the Han working
    >group to integrate GB18030 within Unicode. It was clear that
    >many new characters would become necessary in Unicode 3.0.0
    >even if only Unicode 2.1.9 was published at that time.
    >
    >Microsoft must then have anticipated this by actively
    >experimenting with the proposed models. Immediately adding correct
    >support for surrogates was therefore a high priority, even if a
    >complete charset mapping to Unicode was not available at
    >that time to translate between GB18030 and Unicode.
    >
    >So Windows 2000 should have had full support for surrogates
    >immediately (and should have correctly handled unpaired surrogates
    >as invalid sequences in filenames, as well as in its
    >international support libraries, simply because it was needed
    >for GB18030 support)...
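
As a rough illustration of what treating unmatched surrogates as invalid sequences involves, the sketch below scans a list of UTF-16 code units and reports every position holding a surrogate that is not part of a well-formed high/low pair. The function name find_unpaired_surrogates is hypothetical, and nothing here describes how Windows 2000 actually validated filenames.

```python
def find_unpaired_surrogates(units):
    """Return the indexes of surrogates in units that have no valid partner.

    units is a sequence of integers, each one a 16-bit UTF-16 code unit.
    """
    bad = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # A high surrogate must be followed at once by a low surrogate.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2                   # well-formed pair, skip both units
                continue
            bad.append(i)                # high surrogate with no partner
        elif 0xDC00 <= u <= 0xDFFF:
            bad.append(i)                # low surrogate with no preceding high
        i += 1
    return bad


def is_well_formed_utf16(units):
    """True when units contains no unpaired surrogate."""
    return not find_unpaired_surrogates(units)


print(find_unpaired_surrogates([0x41, 0xD834, 0xDD1E, 0x42]))  # []
print(find_unpaired_surrogates([0x41, 0xDC00, 0xD800]))        # [1, 2]
```

A caller that wants to reject such names outright could test is_well_formed_utf16 before accepting the string; whether to reject or to substitute a replacement character is a policy choice that the thread does not settle.
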
    >
    >




