Re: Merging combining classes, was: New contribution N2676

From: John Cowan (jcowan@reutershealth.com)
Date: Wed Oct 29 2003 - 16:18:59 CST


Language Analysis Systems, Inc. Unicode list reader scripsit:

> It suggests that for many fonts,
>
> U+0067 LATIN SMALL LETTER G + U+0327 COMBINING CEDILLA
>
> and
>
> U+0067 LATIN SMALL LETTER G + U+0312 COMBINING TURNED COMMA ABOVE
>
> would have exactly the same rendering. Some applications would need to
> know this and treat U+0067 U+0327 and U+0067 U+0312 as equivalent.

There is no requirement that any given font make all characters distinguishable.
Some characters are almost always indistinguishable anyhow (A vs. Alpha vs.
Cyrillic A); others may be indistinguishable in a particular font, such as Latin
alpha vs. Greek alpha, or even Latin a vs. Latin alpha (which means the font is
not usable for IPA, but that's allowed). The Last Resort font makes all Greek
letters indistinguishable from one another, all Cyrillic letters
indistinguishable from one another, and so on.
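
To make the A vs. Alpha vs. Cyrillic A case concrete, here is a minimal sketch
using Python's standard unicodedata module. The helper names (describe,
distinct_after_normalization) are illustrative only and come from nowhere in
the standard. It shows that the three letters are separate characters at the
encoding level, and stay separate under normalization, however a font draws
them.

```python
import unicodedata

# Latin A, Greek Alpha, Cyrillic A: usually drawn identically by fonts.
lookalikes = ["\u0041", "\u0391", "\u0410"]

def describe(chars):
    """Return (code point, character name) pairs for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]

def distinct_after_normalization(chars, form="NFKC"):
    """True if no two of the characters normalize to the same string."""
    return len({unicodedata.normalize(form, c) for c in chars}) == len(chars)

for code, name in describe(lookalikes):
    print(code, name)

# Even the compatibility form NFKC keeps the three letters apart.
print(distinct_after_normalization(lookalikes))
```

Nothing in the character data says the glyphs look alike; that is purely a
property of the font.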

> I wonder if there's call for some sort of table of Unicode sequences
> that aren't canonically equivalent but render the same.

Such a thing would be highly font-dependent and variable.
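
By contrast, canonical equivalence is defined by the standard and does not
depend on any font. A rough sketch in Python, again with the standard
unicodedata module (the function name canonically_equivalent is made up for
this example), checks the two g sequences from the quoted message:

```python
import unicodedata

g_cedilla = "\u0067\u0327"        # g + COMBINING CEDILLA
g_turned_comma = "\u0067\u0312"   # g + COMBINING TURNED COMMA ABOVE

def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# The two sequences are different as far as the standard is concerned,
# even if a given font renders them identically.
print(canonically_equivalent(g_cedilla, g_turned_comma))

# g + cedilla is, however, equivalent to the precomposed U+0123.
print(canonically_equivalent(g_cedilla, "\u0123"))

# The two marks also carry different canonical combining classes.
print(unicodedata.combining("\u0327"), unicodedata.combining("\u0312"))
```

A table of same-looking sequences would have to sit on top of checks like
this one, with the font as an extra input.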

-- 
"They tried to pierce your heart                John Cowan
with a Morgul-knife that remains in the         http://www.ccil.org/~cowan
wound.  If they had succeeded, you would        http://www.reutershealth.com
become a wraith under the domination of the Dark Lord."         --Gandalf

