Re: unicode v1.1 vs unicode v2.0

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Sep 19 2001 - 14:48:27 EDT


Oodi,

> Hi, I am desperately trying to figure out what the differences are between
> Unicode v1.1 and v2.0. I of course understand that new characters have been
> added and that's fine. But I also understand that some character mappings
> have been changed and this could cause problems, as we have different
> clients on different versions of Unicode trying to pass data to each other.
> I don't want different clients to interpret the same Unicode value as
> different characters. I've searched the unicode.org website and I can't
> find a summary of changes between v1.1 and v2.0. Do you know
> where I can get this information?
>

It is printed in detail in Annex D.3 of Unicode 2.0, "Changes from Unicode 1.1
to Unicode 2.0". So round yourself up a copy of 2.0.

> I found on the unicode.org website the character mapping for v2.0 and that
> for v1.1.5. I had really wanted 1.1 but that was unavailable.

The UnicodeData-1.1.5.txt file *is* Unicode 1.1.

> Anyway, I
> thought I would at least check what the differences were between version 2.0
> and 1.1.5. Well, to my amazement, while version 2 added characters in Hebrew
> and Tibetan, it looked like it removed a whole bunch of Hangul characters.
> I had expected characters added but not removed. I thought maybe they just
> moved them to other values, but according to my diff file, there were a whole
> bunch of Hangul characters that were in 1.1.5 that were not in v2.0, but
> there were no Hangul characters in v2.0 that were not also in v1.1.5. Of
> course, I don't know Hangul, so is it that what was removed was not needed?
> Any information on this would be appreciated.

It is all spelled out in Annex D.3, as I said, but here is the gist of the
issue for you:

================================================================

"Areas Redefined
...

 
Hangul Syllables Area. The the [sic] Unicode Standard Version 1.0 Hangul
Area (U+3400 - U+3D2D) and the Unicode Standard, Version 1.1 Hangul
Supplementary Syllables A and B areas (U+3D2E - U+4DFF) containing 4,306
Korean Hangul syllables have been removed. In their stead in the Unicode
Standard, Version 2.0, the Hangul Syllables Area (U+AC00 - U+D7A3)
containing 11,172 Korean Hangul syllables has been added.

Characters Moved

The redefinition of the Hangul Syllables Area resulted in the movement
of previously-encoded Hangul Syllables to the new area. No other
characters have been moved in the transition from the Unicode Standard,
Version 1.1 to the Unicode Standard, Version 2.0."

================================================================

This change was the result of Amendment 5 to 10646-1:1993, also
variously known as "fixing Korean" or "the Korean fiasco", depending
on your point of view.

Note that the 11,172 Korean Hangul syllables are a strict superset
of the two sets of Korean Hangul syllables from Unicode 1.0 and
Unicode 1.1 -- none were removed; they were simply moved into the
new block. The names of Korean Hangul syllables were redefined
(see TUS, 3.0, p. 55 for the algorithm), and because of this, starting
with UnicodeData-2.0.14.txt (for Unicode 2.0), the Hangul syllables
are no longer listed explicitly in the UnicodeData file. That may
be why it appears that they were removed from the standard -- they
were not.

All Hangul syllable values from Unicode 1.1 can be safely converted
to Unicode 2.0, and I am sure Oracle has conversions for that.
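
A conversion of that kind might look roughly like the sketch below. It only
assumes the ranges quoted above: old Hangul lived in U+3400 - U+4DFF and
must land in U+AC00 - U+D7A3. The mapping table itself is *not* supplied
here; `table` is a hypothetical dict from old code point to new code point
that you would have to obtain from a real conversion resource, and the
function names are mine.

```python
# Sketch: remap Unicode 1.1 Hangul to Unicode 2.0 using a caller-supplied table.
# The table contents are an assumption; this code does not define any mapping.
OLD_HANGUL_FIRST, OLD_HANGUL_LAST = 0x3400, 0x4DFF   # Unicode 1.0/1.1 Hangul areas
NEW_HANGUL_FIRST, NEW_HANGUL_LAST = 0xAC00, 0xD7A3   # Unicode 2.0 Hangul Syllables Area


def in_old_hangul_area(cp: int) -> bool:
    """True if cp falls in the areas that held Hangul before Unicode 2.0."""
    return OLD_HANGUL_FIRST <= cp <= OLD_HANGUL_LAST


def convert_1_1_to_2_0(text: str, table: dict) -> str:
    """Replace old-area Hangul code points via `table`; leave everything else alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if not in_old_hangul_area(cp):
            out.append(ch)          # no other characters moved between 1.1 and 2.0
            continue
        if cp not in table:
            raise KeyError(f"no mapping supplied for U+{cp:04X}")
        new_cp = table[cp]
        if not NEW_HANGUL_FIRST <= new_cp <= NEW_HANGUL_LAST:
            raise ValueError(f"U+{cp:04X} maps outside the Hangul Syllables Area")
        out.append(chr(new_cp))
    return "".join(out)
```

Failing loudly on an unmapped value is a deliberate choice in this sketch:
silently passing an old-area code point through would give exactly the
"same value, different character" problem you are worried about.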

No *other* characters were removed or moved between Unicode 1.1 and
Unicode 2.0.

--Ken


