Re: 127 strokes beyond the radical?!

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Jul 21 2000 - 20:59:58 EDT


Patrick asked:

> >Patrick Andries wrote:
> > > De : <11digitboy@bolt.com>
> > > > On page 876, the character U+6B8B is listed as being
> > > > 127 strokes beyond the radical. I'd say it's more
> > > > like 6 strokes beyond the radical.
> > >
> > > I believe it to be 5 strokes and it is already listed under
> > > radical + 5 strokes.
>
> > [Asmus] If you read the book, it's listed under 6, not 5.
>
> I stand corrected for having wrongly excluded the + 6 form. But I wonder if
> I'm, however, wrong to suggest the +5 form ? Isn't U+6B8B the last
> ideograph in the radical + 5 and radical + 6 lists on page 876 of TUS 3.0 ?
> It is true that for TUS 2.0, page 8-23, U+6B8B seems only to be listed under
> radical+6 but with a radical+5 glyph...
>
> I have checked ISO/CEI 10646-1:2000 and it looks like the +5 strokes form is
> a simplified Chinese form (G-Hanzi) [2 horizontal strokes in the "suffix"]
> and the +6 strokes form [3 horizontal strokes in the "suffix"] is used in
> Japan, Korea and Vietnam.

In the Unicode 3.0 radical/stroke index, the theory of indexing has been changed
a bit. For Unicode 2.0, the radical/stroke index chose only *one* stroke count
in those instances where multiple counts were possible. For Unicode 3.0, both
possibilities are explicitly listed in the radical/stroke index, where
applicable. Whichever font you are using to count the strokes, you are then
likely to end up in the appropriate numeric subrange and find the character
quickly. That spares you the well-known Han character lexical lookup torture
task of also scanning the ranges above and below what you counted, just in case
the lexicographer was counting strokes differently than you were. U+6B8B is a
case in point, but there are a number of other instances scattered throughout
the index.

The 127 stroke bug, *and* the double entry in the index, are the result of
the following entries in Unihan.txt:

U+6B8B kRSJapanese 78.127
U+6B8B kRSKangXi 78.5
U+6B8B kRSMerged 78.6
U+6B8B kRSUnicode 78.6

Clearly the entry for kRSJapanese is incorrect and should be corrected to
read 78.6.
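
A bad value like this one could be caught mechanically. What follows is only a minimal Python sketch, not the tool actually used to build the index: it parses the four entries quoted above, flags any residual stroke count above an assumed plausibility threshold, and collects the distinct radical.stroke values for a character. The names parse_rs_entries, suspicious_entries, index_positions and MAX_PLAUSIBLE_STROKES are hypothetical, the threshold of 64 is an arbitrary assumption, and only the simple "radical.strokes" value form shown above is handled.

```python
from collections import defaultdict

# The entries exactly as quoted in this message (whitespace-separated here).
UNIHAN_LINES = """\
U+6B8B kRSJapanese 78.127
U+6B8B kRSKangXi 78.5
U+6B8B kRSMerged 78.6
U+6B8B kRSUnicode 78.6
"""

# Assumed upper bound for a believable residual stroke count (hypothetical).
MAX_PLAUSIBLE_STROKES = 64


def parse_rs_entries(text):
    """Map each code point to {field: (radical, residual_strokes)}."""
    entries = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code_point, field, value = line.split(None, 2)
        if not field.startswith("kRS"):
            continue  # only radical/stroke fields are of interest
        radical, strokes = value.split(".")
        entries[code_point][field] = (int(radical), int(strokes))
    return entries


def suspicious_entries(entries, limit=MAX_PLAUSIBLE_STROKES):
    """List (code_point, field, strokes) whose stroke count exceeds limit."""
    return [
        (code_point, field, strokes)
        for code_point, fields in sorted(entries.items())
        for field, (_radical, strokes) in sorted(fields.items())
        if strokes > limit
    ]


def index_positions(entries, code_point):
    """Distinct (radical, strokes) pairs recorded for one code point."""
    return sorted(set(entries[code_point].values()))


entries = parse_rs_entries(UNIHAN_LINES)
print(suspicious_entries(entries))           # [('U+6B8B', 'kRSJapanese', 127)]
print(index_positions(entries, "U+6B8B"))    # [(78, 5), (78, 6), (78, 127)]
```

The second line of output shows the three distinct values present in the data: the 5 and 6 stroke counts, plus the erroneous 127.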

--Ken
 


