Re: Ext-B fonts updated

From: Richard Cook (rscook@socrates.berkeley.edu)
Date: Wed Oct 17 2001 - 16:42:46 EDT


James Kass wrote:
>
> Richard Cook wrote:
>
> > >
> > > > Are there any instructions for reporting errata such as the glyphs
> > > > at U+29FD7 and U+29FCE being identical?
> > > >
> > [U+29FD7] and [U+29FCE] are not identical. They are (admittedly rather
> > close) graphical variants. If you want to ID all graphical variants,
> > you've got a long row to hoe.
> >
>
> The row's long enough without mapping all the graphical variants.
>
See my comments below.

> Attached are two small gifs, 29fce and 29fd7. The glyphs used for
> these two characters on the new chart are identical, as far as I can
> tell. Can someone point out a difference?

Aha! I was looking at a bound version of 10646-2-2000-12-05 (SC2/WG2
N2309) in which the forms are not identical, but betray the variation
which causes the codepoints to be separate. It seems that the font
vendor has done some unification here ...
>
> > For an example of even closer graphical variants (some might even say
> > *exactly* identical forms), compare [U+20a37] and [U+200ae] ... which I
> > mentioned to Mr. Jenkins a few weeks ago. As he pointed out, they both
> > have T-source numbers, and were perhaps deunified because they're
> > separate in CNS 11643 ...
> >
>
> The difference between the glyphs used for U+20A37 and U+200AE on
> the chart is obvious. The two glyphs are similar but not identical.
> They are stored under different base radicals.
>
> > [U+20a37] and [U+200ae] along with [U+28443], [U+20a31] and [U+20a5f]
> > are of course all variants of [U+8fb0].
> >
>
> The variances are clear on the chart(s) and the glyphs look quite
> different in some cases. If these characters are all variants of
> U+8FB0, which is a Chinese radical (#161), shouldn't they all be
> stored under that radical?
>
As the esteemed Dr. Whistler wrote, graphical variation sometimes leads
to classificational variation ... and when variants get variously
classified, they may also wind up being variously encoded.

What we really need is a field in Unihan.txt which could be used to
unify Han graphical variants. Of course, unification is a judgement
call, and some cases are more open to contention than others, but I
think that on the whole such a field would be rather useful, at least
as useful as the kRSUnicode and kRSKangXi fields.
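
A minimal sketch of how such a field might be consumed, assuming it
followed the usual tab-separated Unihan.txt layout (code point, field
name, value). The field name kGraphicalVariant is hypothetical and is
not an existing Unihan field; the sample records simply restate the
variant claims made earlier in this message and are not real data.

```python
from collections import defaultdict

# Hypothetical field name: an assumption for illustration only.
VARIANT_FIELD = "kGraphicalVariant"

# Illustrative records in the tab-separated Unihan layout.
# The variant relations are the ones claimed in this thread.
SAMPLE = """\
# illustrative data only
U+20A37\tkGraphicalVariant\tU+8FB0
U+200AE\tkGraphicalVariant\tU+8FB0
U+28443\tkGraphicalVariant\tU+8FB0
U+20A31\tkGraphicalVariant\tU+8FB0
U+20A5F\tkGraphicalVariant\tU+8FB0
U+8FB0\tkRSUnicode\t161.0
"""


def parse_codepoint(token):
    """Turn a string such as 'U+8FB0' into the integer 0x8FB0."""
    if not token.upper().startswith("U+"):
        raise ValueError("not a code point: %r" % token)
    return int(token[2:], 16)


def group_variants(text, field=VARIANT_FIELD):
    """Map each base code point to the set of forms unified with it."""
    groups = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] != field:
            continue  # ignore other fields such as kRSUnicode
        source, _, value = parts
        for target in value.split():
            base = parse_codepoint(target)
            groups[base].add(base)
            groups[base].add(parse_codepoint(source))
    return dict(groups)


if __name__ == "__main__":
    for base, members in sorted(group_variants(SAMPLE).items()):
        forms = " ".join("U+%04X" % cp for cp in sorted(members))
        print("U+%04X: %s" % (base, forms))
```

The grouping here is one level deep: every record points directly at
its base form. If variants could point at other variants, the groups
would have to be merged transitively, which this sketch does not do.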

-Richard



This archive was generated by hypermail 2.1.2 : Wed Oct 17 2001 - 17:32:37 EDT