Re: Yet another Unihan Q

From: Martin J. Duerst
Date: Fri Jun 06 1997 - 05:20:08 EDT

On Fri, 6 Jun 1997, Kai-hsu Tai wrote:

> Adrian Havill wrote,
> > Not me, but Jenkins wrote:
> > > E.g., the official "Taiwanese" glyph for U+8349 ("grass") per ISO/IEC
> > > 10646 uses four strokes for the "grass" radical, whereas the PRC,
> > > Japanese, and Korean glyphs use three. As it happens, Apple's LiSung
> > > Light font for Big Five (which follows the "Taiwanese" typographic
> > > tradition) uses three strokes.

> > Given that it's a very common variation/simplification, and that it's
> > kind of special in that the simplification involves a stroke count
> > change, I think it should be mentioned in Table 6-25 so as to better
> > clarify exactly where the boundary between "typeface" and "abstract
> > shape" is.
> Granted, this should be addressed if we want to be technically rigorous.

Table 6-25 and the related text are only a short summary of the
unification rules. There is more documentation on them, for example
in the explanatory part of JIS 221 (Japanese version of ISO 10646).
There, the grass radical is listed as an example of components
unified because they were considered to be glyph variants.

It's difficult to capture the "feeling" people have about such
cases in a technically rigorous manner.

> > I agree with you in that they're the same character and it's a typeface
> > variation that belongs on the "Z" axis, but it's not uncommon for people
> > to get upset when they insist that the proper way to write their surname
> > is with the four stroke U+8279, insisting that it's not the same
> > character and is not the same as the three stroke. So the more specific
> > the charts are, the better prepared TUS is for the future.
> In my 18 years of living in Taiwan, I have never heard anyone (except
> primary school teachers) insisting on having their names written with the
> 4-stroke "grass".

That's a good way to put it. Besides primary school teachers, there
seems to be another group, namely computer scientists with bad memories
of school. The problem is that to learn kanji, it's very important to
always use the same stroke sequence. Otherwise, you remain on the level
of drawing and never get to writing. But that doesn't mean that all
people have to use the same sequence and other details for the rest of
their lives, or that all fonts have to look exactly the same.
Even for school, there is more variance allowed than one usually
thinks. I have a handbook for Japanese school teachers that has
a page for each character, from first to sixth grade, and it's
amazing how many variants it lists as being acceptable.

> A similar issue is brought up more commonly, though:
> regarding the character for "yellow" (U+9EC3 or U+9EC4), the Ministry of
> Education on Taiwan insists that U+9EC3 should be used, while most people
> having that surname (Mandarin: "Huang") write U+9EC4, with a "radical"
> similar to "grass". But it's not even listed under the "grass"
> radical--it's got a radical entry all its own, with 8 followers.

It's not under that radical because it has nothing to do with grass
(not that yellow is not related to grass, but that the origin of
this shape is completely different, namely the tip of a flame).

> So, pertaining to Unicode, here is the issue. If U+9EC3 and U+9EC4 are
> listed at different codepoints, shouldn't all the followers (U+9EC5 to
> U+9ECC) have duplicate codepoints too? Or is this because the source
> character sets didn't have the unlisted counterparts? I haven't got time
> to look at these characters in the TUS CD, and don't care enough either.

It's a T source set separation issue. Again, JIS 221 is very
informative here.
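
For anyone who wants to look at the code points in question, here is a
minimal Python sketch. It is only an illustration under stated
assumptions: it uses the standard unicodedata module, the helper names
describe and distinct_after_normalization are hypothetical, and the
normalization check merely shows that the two "yellow" forms are
encoded as separate characters with no compatibility mapping between
them. It says nothing about why they were kept apart.

```python
import unicodedata

# Code points mentioned in the thread.
YELLOW_PAIR = (0x9EC3, 0x9EC4)       # the two forms of "yellow"
FOLLOWERS = range(0x9EC5, 0x9ECD)    # U+9EC5 .. U+9ECC inclusive


def describe(cp):
    """Return 'U+XXXX <char> <name>' for one code point (hypothetical helper)."""
    ch = chr(cp)
    return f"U+{cp:04X} {ch} {unicodedata.name(ch, '<unnamed>')}"


def distinct_after_normalization(a, b, form="NFKC"):
    """True if the two code points still differ after normalization."""
    return unicodedata.normalize(form, chr(a)) != unicodedata.normalize(form, chr(b))


if __name__ == "__main__":
    for cp in (*YELLOW_PAIR, *FOLLOWERS):
        print(describe(cp))
    # Shows whether the two "yellow" forms remain separate characters.
    print(distinct_after_normalization(*YELLOW_PAIR))
```

The range U+9EC5 to U+9ECC covers the eight "followers" mentioned
above; each of them has a single code point.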

Regards, Martin.

