ISO 10646 & GB18030 repertoire

From: Arcane Jill (arcanejill@ramonsky.com)
Date: Fri Jan 07 2005 - 03:43:36 CST

    All sounds a bit pedantic to me. Surely /no/ applications "represent" LATIN
    SMALL LETTER A WITH ACUTE as U+00E1, if by "represent" you mean export the
    representation to the outside world. (The internal representation of Unicode
    characters within an application is private and opaque, and sometimes not
    even known to the programmer if they use a library which abstracts the
    concept.)

    Instead, Unicode defines LATIN SMALL LETTER A WITH ACUTE as U+00E1, and
    applications export U+00E1 as either <0xC3 0xA1> (UTF-8), <0x00E1> (UTF-16),
    <0x000000E1> (UTF-32) ... or <0xA8 0xA2> (GB18030). In which case, surely
    GB18030 is an encoding form of Unicode, just like the UTFs.
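
    As a rough illustration of the exported byte sequences, here is a minimal
    sketch, assuming Python 3 and its built-in codecs (the names `ch`,
    `encodings` and `serialized` are made up for this example). Big-endian
    variants are chosen so that no byte order mark is emitted.

```python
# Serialize U+00E1 (LATIN SMALL LETTER A WITH ACUTE) with several codecs.
ch = "\u00e1"

# Codec names as registered in Python's standard library.
encodings = ["utf-8", "utf-16-be", "utf-32-be", "gb18030"]

# Map each codec name to the bytes it produces for this one character.
serialized = {name: ch.encode(name) for name in encodings}

for name, data in serialized.items():
    hex_bytes = " ".join(f"{b:02X}" for b in data)
    print(f"{name:10} {hex_bytes}")
```

    The same abstract character comes out as a different byte sequence under
    each codec, and each sequence decodes back to U+00E1.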

    No?

    Jill

    -----Original Message-----
    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    On Behalf Of Kenneth Whistler
    Sent: 06 January 2005 21:35
    To: verdy_p@wanadoo.fr
    Cc: unicode@unicode.org; kenw@sybase.com
    Subject: Re: ISO 10646 compliance and EU law

    If an application is representing LATIN SMALL LETTER A WITH ACUTE as
    <A8 A2>, then it is conforming with GB 18030-2000. (And also,
    not coincidentally, GB 2312-1980 and Microsoft Code Page 936.)

    If an application is representing LATIN SMALL LETTER A WITH ACUTE as
    U+00E1 (<0xC3 0xA1>, 0x00E1, 0x000000E1, depending on encoding form),
    then it is conforming with the Unicode Standard (and ISO/IEC 10646:2003).

    If an application is mapping between the two, then it is interoperating.

    But the fact that a mapping table exists does not demonstrate that
    a Unicode application itself is conforming to the Unicode Standard.
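
    The "mapping between the two" case can be sketched as a simple round trip.
    This is only an illustration, assuming Python 3 and its built-in `gb18030`,
    `gb2312` and `cp936` codecs; the variable names are hypothetical and this
    says nothing about whether a given application conforms to either standard.

```python
# Start from the two-byte sequence <A8 A2>.
gb_bytes = bytes([0xA8, 0xA2])

# Decode the same bytes with the three Chinese codecs named in the message.
decoded = {name: gb_bytes.decode(name) for name in ("gb18030", "gb2312", "cp936")}

# Map to Unicode, then re-serialize in a Unicode encoding form (UTF-8 here).
text = gb_bytes.decode("gb18030")
code_point = ord(text)
utf8_bytes = text.encode("utf-8")

# Map back again: the round trip should reproduce the original bytes.
round_trip = utf8_bytes.decode("utf-8").encode("gb18030")

print(f"U+{code_point:04X}", utf8_bytes, round_trip)
```

    The conversion goes through the code point U+00E1 in both directions,
    which is the table-driven interoperation described above.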
