Re: Yi in CJK Ideographs Area?

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Oct 25 1999 - 13:40:05 EDT


Tony,

>
> At 21 Oct 1999 12:24 -0700, Kenneth Whistler wrote:
> > Tony Graham asked:
> > > In Unicode 3.0, are the Yi syllables and radicals included in the CJK
> > > Ideographs Area?
> >
> > No.
> >
> > CJK Unified Ideographs: 3400..4DB5, 4E00..9FA5
> >
> > Yi: A000..A4C6
> >
> > Hangul Syllables: AC00..D7A3
>
> The Yi Syllables area isn't listed in Index-3.0.0.txt, whereas, for
> example, the CJK Ideographs Area is listed:

There is no "Yi Syllables area". The UTC is not inventing any
new areas when new scripts go in.
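
To make the answer concrete, here is a minimal sketch that checks a code
point against the ranges quoted above. The table holds only the ranges
mentioned in this thread for Unicode 3.0, and the function name
script_range is made up for illustration; it is not part of any Unicode
data file or library.

```python
# Ranges as quoted in this thread for Unicode 3.0 (end points inclusive).
# This is not a complete block list, only the ranges under discussion.
RANGES = [
    ("CJK Unified Ideographs", 0x3400, 0x4DB5),
    ("CJK Unified Ideographs", 0x4E00, 0x9FA5),
    ("Yi", 0xA000, 0xA4C6),
    ("Hangul Syllables", 0xAC00, 0xD7A3),
]


def script_range(code_point):
    """Return the label of the quoted range containing code_point, or None."""
    for label, first, last in RANGES:
        if first <= code_point <= last:
            return label
    return None
```

With this table, script_range(0xA000) returns "Yi" rather than
"CJK Unified Ideographs", since the Yi range begins after 9FA5.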

>
> ------------------------------------------------------------
> bash.exe-2.02$ egrep -i area * | egrep -i -e cjk -e yi
> Index-3.0.0.txt:CJK Ideographs Area 3400
> Index-3.0.0.txt:CJK Phonetics and Symbols Area 2E00
> Index-3.0.0.txt:Ideographs Area, CJK 3400
> Index-3.0.0.txt:Phonetics and Symbols Area, CJK 2E00
> Index-3.0.0.txt:Symbols Area, CJK Phonetics and 2E00
> UnicodeData-3.0.0.html: <li>Added place holders for ranges such as CJK Ideographic Area and the Private Use Area. </li>
> ------------------------------------------------------------

This is mostly a legacy of the index itself, which has existed
since Unicode 1.0 and has been updated by hand for each new
edition. The inconsistent treatment of "Area" in the index stems
from the deemphasis of area as an organizing concept -- and no
one has gone through the index systematically to correct that
part.

The use of "areas" in UnicodeData-3.0.0.html is to mark the
start and end points for assigned code points which are not
listed in the data file because their names are predictable
by algorithm (Han, Hangul), or because they have no names
(Private Use, Surrogates).
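
As a rough sketch of how those start and end markers can be read back
out, the following assumes the placeholder entries in the data file use
the semicolon-separated layout with names of the form "<Label, First>"
and "<Label, Last>". The sample lines are modelled on that layout and
use the ranges quoted in this thread; treat the remaining field values
as illustrative. The function name placeholder_ranges is hypothetical.

```python
# Sample placeholder entries, modelled on the data file's layout.
# Field values other than the code point and name are illustrative.
SAMPLE = """\
3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FA5;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;
E000;<Private Use, First>;Co;0;L;;;;;N;;;;;
F8FF;<Private Use, Last>;Co;0;L;;;;;N;;;;;
"""

FIRST_SUFFIX = ", First>"
LAST_SUFFIX = ", Last>"


def placeholder_ranges(text):
    """Pair up First/Last placeholder entries into (label, first, last)."""
    ranges = []
    pending = {}  # label -> first code point, awaiting its Last entry
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        code, name = fields[0], fields[1]
        if not name.startswith("<"):
            continue  # ordinary entry with an explicit character name
        if name.endswith(FIRST_SUFFIX):
            pending[name[1:-len(FIRST_SUFFIX)]] = int(code, 16)
        elif name.endswith(LAST_SUFFIX):
            label = name[1:-len(LAST_SUFFIX)]
            ranges.append((label, pending.pop(label), int(code, 16)))
    return ranges


for label, first, last in placeholder_ranges(SAMPLE):
    print(f"{label}: {first:04X}..{last:04X}")
```

Yi does not show up in such a listing because its characters carry
explicit names and are listed individually in the data file.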

>
> It would be useful if the ends of the allocation areas were also
> listed somewhere, since the areas are not contiguous.
>
> > The concept of "Area" has no normative status in the Unicode Standard.
> > It was deemphasized in Unicode 2.0, and has been further deemphasized
> > in the text of Unicode 3.0.
>
> Organising the character block descriptions in the Unicode Standard,
> Version 2.0, by their allocation area was an interesting way to
> deemphasize the concept of "area".

Sarcastic, I presume? It was the result of just continuing the exact
same organization that was present in the explanatory chapters for
Unicode 2.0. That was the path of least resistance that got the
book published at the time.

> I note that the character block
> descriptions in the Unicode Standard, Version 3.0, will not be
> organised this way.

Correct.

--Ken


