Re: Yet another Unihan Q (was Re: Comments on <draft...)

From: Adrian Havill (havill@threeweb.ad.jp)
Date: Fri Jun 06 1997 - 01:25:40 EDT


Not me, but Jenkins wrote:
> E.g., the official "Taiwanese" glyph for U+8349 ("grass") per ISO/IEC
> 10646 uses four strokes for the "grass" radical, whereas the PRC,
> Japanese, and Korean glyphs use three. As it happens, Apple's LiSung
> Light font for Big Five (which follows the "Taiwanese" typographic
> tradition) uses three strokes.

Kai-hsu Tai quoted and replied:
> They are not different characters! The distinction between the 3-stroke
> and the 4-stroke variants are pretty much like the two kinds of LATIN
> SMALL LETTER G's in Times New Roman and Helvetica, one somewhat looking
> like "8", the other somewhat like "9".

Yes, I know that they are the same character, but Table 6-25 in TUS 2.0
(Ideographs Unified) does not seem to mention this case... the closest
category it offers is "different writing sequence," but whether or not
this includes an "increase in the number of strokes" is unclear. The
example they give, 周, has the same number of strokes* in both forms.
There is also no example for "variations of a radical."

Other tables use examples such as "主" and show that the character
stays the same even when the top "ten" (dot) is bent in different
directions, but all of the examples have the same total stroke count.
If the 4-stroke "grass" radical #140 (U+8279), which is a simplified
form of 艸, is considered to be a typeface variation, it should be
indicated more clearly in Table 6-25, IMO, as many of the examples
could be considered to be "typeface variations" if compared to the
example of a typeface variation given in figure 6-25.

There are cases in JIS X 0212-1990 where the four stroke grass roof
pattern is present (granted: in these cases, the radical is not 艸).

Given that it's a very common variation/simplification, and that it's
kind of special in that the simplification involves a stroke count
change, I think it should be mentioned in Table 6-25 so as to better
clarify exactly where the boundary between "typeface" and "abstract
shape" is.

I agree with you that they're the same character and that it's a
typeface variation that belongs on the "Z" axis, but it's not uncommon
for people to get upset and insist that the proper way to write their
surname is with the four stroke U+8279, and that it is not the same
character as the three stroke form. So the more specific the charts
are, the better prepared TUS is for the future.
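
The axis model referred to above can be sketched roughly as follows.
This is only an illustration: the class, the field names, and the
string labels are hypothetical and are not taken from TUS; the one
assumption carried over is that forms differing only on the "Z"
(typeface) axis share a code point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    """One concrete form of an ideograph (hypothetical toy model)."""
    meaning: str         # X axis: semantic
    abstract_shape: str  # Y axis: abstract shape
    typeface_form: str   # Z axis: actual shape in a given typeface


def unified(a: Form, b: Form) -> bool:
    """Forms that differ only on the Z axis are treated as one character."""
    return (a.meaning, a.abstract_shape) == (b.meaning, b.abstract_shape)


# The two renderings of U+8349 discussed in this thread (labels are made up).
grass_3 = Form("grass", "grass radical over 早", "three stroke radical")
grass_4 = Form("grass", "grass radical over 早", "four stroke radical")
early = Form("early", "早", "any typeface")

print(unified(grass_3, grass_4))
print(unified(grass_3, early))
```

The open question in this post is which axis a stroke count change
falls on; the sketch simply assumes the answer is "Z".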

My original post was simply an attempt to clarify, with hard examples
and terminology, the specific boundaries as to what goes where...
Unicode makes exceptions for characters that are already in existing
standards even though they are technically "unified" (the same). As CNS
uses the four stroke glyph in its source, I believed that there was a
possibility that it might be included, even though the character is
"the same." I realize that the "Source Separation Rule" applies only
when a single source set has two characters that would normally be
unified, keeping them apart to preserve round-trip conversion, so it
doesn't apply here, but you never know what may be added/changed (e.g.
Hangul), esp. if the "source" changes.
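
A toy sketch of why separation within a single source set matters for
round-trip conversion. The source codes ("SRC-A1", "SRC-A2") and the
private-use code points are hypothetical placeholders, not real CNS,
JIS, or Unicode assignments.

```python
# Hypothetical source set with two codes for variants of one character.
UNIFIED = {"SRC-A1": "U+E000", "SRC-A2": "U+E000"}    # variants merged
SEPARATED = {"SRC-A1": "U+E000", "SRC-A2": "U+E001"}  # variants kept apart


def round_trips(mapping: dict) -> bool:
    """True if every source code survives source -> target -> source."""
    reverse = {}
    for src, tgt in mapping.items():
        # On a collision the first source code wins, as a converter
        # with a single reverse entry per target would behave.
        reverse.setdefault(tgt, src)
    return all(reverse[tgt] == src for src, tgt in mapping.items())


print(round_trips(UNIFIED))
print(round_trips(SEPARATED))
```

With the merged mapping, "SRC-A2" comes back as "SRC-A1", which is the
loss the rule is meant to prevent. A variant that appears only once in
its source set, as with the four stroke glyph here, never hits this
collision.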

* Strokes/radicals based on Japanese standards, that is.

-- 
Adrian Havill <URL:http://www.threeweb.ad.jp/>
Engineering Division, System Planning & Production Section


