From: Mark Davis (mark.davis@jtcsv.com)
Date: Thu Jan 06 2005 - 17:36:01 CST
I agree with Ken's statement, but would qualify one bit.
> And this means that GB 18030 / Unicode mapping tables up
> to about March 31, 2005 will contain the mappings:
>
> FE90 <--> U+E854
> 82359133 <--> U+9FBA
>
> After that time, they will contain the mappings:
>
> ???? <--> U+E854
> FE90 <--> U+9FBA
> 82359133 <--> ???? (probably U+FFFD)
TR22 (http://www.unicode.org/reports/tr22/) recommends mapping tables of the
following form to handle that situation: the old mappings are changed into
one-way mappings, which provides a more graceful transition.
FE90 <-- U+E854
FE90 <--> U+9FBA
82359133 --> U+9FBA
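
As a rough illustration only, the three entries above could be loaded into separate decode and encode lookups as sketched below. The names TABLE and build_maps and the direction strings are hypothetical, chosen for this sketch; they are not the TR22 file syntax.

```python
# Each entry: (GB 18030 byte sequence, Unicode code point, direction).
# Direction markers are an assumption for this sketch:
#   "<->" round trip
#   "<-"  Unicode to GB 18030 only
#   "->"  GB 18030 to Unicode only
TABLE = [
    (bytes.fromhex("FE90"), 0xE854, "<-"),
    (bytes.fromhex("FE90"), 0x9FBA, "<->"),
    (bytes.fromhex("82359133"), 0x9FBA, "->"),
]

def build_maps(table):
    """Split a direction-tagged table into decode and encode dictionaries."""
    to_unicode = {}    # GB 18030 bytes -> code point
    from_unicode = {}  # code point -> GB 18030 bytes
    for gb, cp, direction in table:
        if direction in ("<->", "->"):
            to_unicode[gb] = cp
        if direction in ("<->", "<-"):
            from_unicode[cp] = gb
    return to_unicode, from_unicode

to_unicode, from_unicode = build_maps(TABLE)
```

With these entries, both FE90 and the old four-byte sequence 82359133 decode to U+9FBA, while U+E854 and U+9FBA both encode to FE90, so data written under the old table is still readable.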
This does not detract from the point that Ken is making.
Mark
----- Original Message -----
From: "Kenneth Whistler" <kenw@sybase.com>
To: <verdy_p@wanadoo.fr>
Cc: <unicode@unicode.org>; <kenw@sybase.com>
Sent: Thursday, January 06, 2005 12:08
Subject: Re: Re: ISO 10646 compliance and EU law
> Philippe,
>
> > Thanks for correcting this refutation by Kenneth.
>
> > So I know that both ISO/IEC 10646 and GB18030 repertoires will be
> > amended, but the current statement in the GB18030 standard is that
> > its mapping with ISO/IEC 10646 will remain closed
>
> Which is false.
>
> > and compatible with
>
> This probably will hold true.
>
> > all future amendments of ISO/IEC 10646
> > (and so, also with Unicode),
> > in a way similar to the synchronization of the repertoire and assignments
> > used by Unicode. From my point of view, both Unicode and GB18030 have
> > now a similar policy to remain synchronized with the base ISO/IEC
> > 10646 character repertoire.
> > This effectively means that if China wants to
> > standardize in GB18030 some precomposed characters that are not in
> > ISO/IEC 10646, this is possible only within the PUA.
>
> This is false. China can do what it wants in GB18030, and the decisions
> that they take will impact implementations, depending on how carefully
> they are synchronized or not.
>
> > GB18030 will remain fully compatible with ISO/IEC10646 and Unicode,
> > but will add a required mutual agreement about its PUA usage.
>
> False. As I will demonstrate below.
>
> > So in practice, the only extensions allowed for the GB18030
> > repertoire are within the PUAs,
>
> False.
>
> > which already have a closed mapping with ISO/IEC 10646 (and Unicode)
> > codepoints.
>
> False.
>
> > All other extensions must be first approved and standardized in
> > ISO/IEC 10646, before GB18030 can be extended with new characters
> > in its repertoire;
>
> False.
>
> > the only alternative would be that China breaks its existing policy
> > about its closed mapping between its GB18030 encoding standard and
> > ISO/10646 codepoints.
>
> It has done so in the past, and this will happen again in the future.
>
> > This would be very bad news for developers that have to support
> > GB18030 in their software, because this would mean specific solutions
> > to support GB18030, without the possibility to map it safely to
> > ISO/IEC 10646 and Unicode.
>
> At last, something indisputably true!
>
> > This would be a new nightmare for interoperability of
> > GB18030-enabled software and Unicode/ISO/IEC 10646-enabled
> > software, which would mean that existing software that complies
> > with Unicode or ISO/IEC 10646 will no longer be compatible with the
> > required GB18030 standard for China.
>
> Correct. Welcome to the wonderful world of GB18030 support for
> China.
>
> >
> > If Kenneth thinks otherwise, then he should explain why,
> > because it would be a serious problem for those that think
> > that their Unicode/ISO/IEC-10646 software will be compatible
> > with the required GB18030 standard for China.
>
> O.k.
>
> Example 1:
>
> GB 18030-2000 defines a CJK component at FE90 and maps that
> component to U+E854, because that component is not encoded
> in Unicode 3.0 or ISO/IEC 10646-1:2000.
>
> Because such PUA mappings for GB 18030-2000 have proven
> very problematical in implementations, the characters in
> question have been added to 10646 (under ballot currently
> in Amd 1 to ISO/IEC 10646:2003). This particular CJK component
> is to be encoded at U+9FBA.
>
> And this means that GB 18030 / Unicode mapping tables up
> to about March 31, 2005 will contain the mappings:
>
> FE90 <--> U+E854
> 82359133 <--> U+9FBA
>
> After that time, they will contain the mappings:
>
> ???? <--> U+E854
> FE90 <--> U+9FBA
> 82359133 <--> ???? (probably U+FFFD)
>
> Example 2:
>
> China decides to add Tibetan BrdaRten syllables to GB 18030
> and map them to PUA characters in 10646.
>
> Well, guess what -- *all* PUA code points in 10646 already
> have defined mappings to GB 18030. That means that the addition
> of the Tibetan BrdaRten syllables and definition of mappings
> will *change* those mappings, and will require changes to the
> mappings tables. The only way to avoid that would be for
> any GB 18030 additions to be defined at specific code points
> currently labelled as empty in GB 18030 but mapped to 10646
> PUA code points. For instance:
>
> TIBETAN CHARACTER KA U ==> AAA1 <--> U+E000
>
> That wouldn't change the code point mapping, but... to actually
> support the standardization of such a set of syllables in
> GB 18030, the vendor mapping tables will have to introduce,
> instead, the one-to-many mappings to actually interpret the
> Tibetan syllables as what they are, instead of PUA code points,
> so you would end up with the following entry in the mapping
> tables:
>
> AAA1 <--> <U+0F40, U+0F74>
>
> Both of these scenarios are either in the works right now, or
> will happen in the not-too-distant future.
>
> If you think the mapping tables will just stay pristine and
> unchanged forever, in the face of such changes, you are smoking
> something. The *REASON* for making such additions is either to
> enable or *force* vendors to change the tables.
>
> > I think it is extremely important that the mapping of codes
> > between GB18030 and ISO/IEC10646 stay closed, even if these
> > codes are still not all assigned to abstract characters.
>
> You can think that, but if you mean by "closed" that the mappings
> stay stable and need not be versioned as either or both of the
> standards change, then you are flat wrong. It won't happen that
> way.
>
> > It is equally important that China then avoids any attempt to
> > extend its GB18030 repertoire without first requesting and
> > getting approval in the ISO/IEC 10646 standard repertoire.
>
> It may be important, but China does not come to WG2 asking
> permission. They are a sovereign entity, and they change
> their own standards as they see fit.
>
> > This is the job of the Ideographic working group and rapporteur
> > to ensure that such an event never occurs, by negotiating these
> > amendments with China and with the ISO working group.
>
> The IRG and its rapporteur have no jurisdiction here. Sure
> its members and anyone else can get involved in the discussions
> to try to minimize the potential for damaging changes. But
> you *will not* be able to prevent changes.
>
> --Ken
>
>
>