Traditional/Simplified Unification [was: Official ISO 3166 ...]

From: Glen Perkins (Glen.Perkins@nativeguide.com)
Date: Fri Dec 03 1999 - 01:40:55 EST

Next message: Asmus Freytag: "Re: RE: Multilingual Documents [was: HTML forms and UTF-8]"
Previous message: Jonathan Rosenne: "Re: Multilingual Documents [was: HTML forms and UTF-8]"
Next in thread: John Jenkins: "Re: Traditional/Simplified Unification [was: Official ISO 3166 ...]"
Maybe reply: John Jenkins: "Re: Traditional/Simplified Unification [was: Official ISO 3166 ...]"
Maybe reply: John Cowan: "Re: Traditional/Simplified Unification [was: Official ISO 3166 ...]"

----- Original Message -----
From: John Cowan <jcowan@reutershealth.com>
To: Unicode List <unicode@unicode.org>
Sent: Thursday, December 02, 1999 2:26 PM
(Original) Subject: Re: Official ISO 3166 country codes online

> > Certainly "zh-cn" vs. "zh-tw" is commonly used to distinguish between
> > simplified and traditional; an obvious misuse but a convenient one. I guess
> > though that with Unicode, this crutch isn't needed since I doubt that there's
> > much (if any) ambiguity between simplified and traditional in Unicode text.
>
> There is no unification of simplified characters with traditional ones.

Is this true?

I'll be more specific: yes, I know that there are separate codepoints for
all common simplified hanzi and their traditional counterparts, but I've
been under the impression that this was solely a (rather convenient)
consequence of the source separation rule.
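
To make the "separate codepoints" point concrete, here is a minimal sketch in Python that prints the code points for a few common traditional/simplified pairs. The pairs are hand-picked for illustration and are an assumption of this example, not a mapping taken from any standard; the helper names are hypothetical.

```python
# Illustrative (traditional, simplified) pairs chosen by hand for this sketch.
# This is not an exhaustive or authoritative mapping.
PAIRS = [("國", "国"), ("馬", "马"), ("語", "语")]


def code_point(ch: str) -> str:
    """Format a single character's code point in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


def separately_encoded(traditional: str, simplified: str) -> bool:
    """Return True if the two forms occupy different code points."""
    return ord(traditional) != ord(simplified)


for trad, simp in PAIRS:
    print(trad, code_point(trad), simp, code_point(simp),
          "separate" if separately_encoded(trad, simp) else "same")
```

This only shows that these particular common pairs are encoded separately; it says nothing about how rarer characters would be treated.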

What about more obscure characters, though? For example, say a given rare
character contains a radical that has been simplified, so it would have a
simplified form if ever anybody decided to write it in a "simplified
context". If that character were to be added to Unicode/10646 as a result of
IRG research, without going through any Chinese national standard first, would
both the simplified and traditional forms be added as separate codepoints,
or would they be unified?

In other words, it was my understanding that simplified and traditional
characters *are* unified overall, theoretically, but that for the most
common of these (accounting for nearly 100% by usage) that unification is
overridden by source separation. (Meaning, of course, that they can be
considered non-unified for all practical purposes while calling them
"unified" officially.)

Am I mistaken? Was there a decision made to not unify simplified and
traditional in any case?

Glen Perkins


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:56 EDT