From: Doug Ewell (email@example.com)
Date: Sun Nov 09 2003 - 21:19:14 EST
Peter Jacobi <peter underscore jacobi at gmx dot net> wrote:
> Now I'm wondering about Tamil LLA (U+0BB3) and
> Tamil AU Length Mark (U+0BD7). They not only have
> incidental equal shapes in the Font used for preparing
> the Unicode charts, they are also indistinguishable in
> handwritten Tamil, typewriter Tamil etc, I am told.
> So for all purposes:
> U+0B95 U+0BCC which is canonically equivalent to
> U+0B95 U+0BC7 U+0BD7
> looks exactly the same as
> U+0B95 U+0BC7 U+0BB3
These examples actually should use U+0BC6, not U+0BC7, since U+0BCC
decomposes canonically to U+0BC6 U+0BD7. But this doesn't detract from
Peter's point.
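For anyone who wants to check the sequences above, here is a minimal sketch using Python's standard unicodedata module. The helper names (canonically_equivalent, code_points) and the constant names are made up for illustration; the results depend on the Unicode data shipped with your Python build, although these Tamil decompositions have been stable for a long time.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    # Two strings are canonically equivalent when their NFD forms match.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

def code_points(s: str) -> str:
    # Hypothetical helper: render a string as a list of U+XXXX values.
    return " ".join(f"U+{ord(c):04X}" for c in s)

KA_AU = "\u0B95\u0BCC"               # KA + VOWEL SIGN AU
KA_E_AULEN = "\u0B95\u0BC6\u0BD7"    # KA + VOWEL SIGN E + AU LENGTH MARK
KA_EE_AULEN = "\u0B95\u0BC7\u0BD7"   # KA + VOWEL SIGN EE + AU LENGTH MARK
KA_E_LLA = "\u0B95\u0BC6\u0BB3"      # KA + VOWEL SIGN E + LETTER LLA

# Show the canonical decomposition of KA + AU.
print(code_points(unicodedata.normalize("NFD", KA_AU)))

# Compare each candidate sequence against KA + AU.
for label, seq in [("E + AU LENGTH MARK", KA_E_AULEN),
                   ("EE + AU LENGTH MARK", KA_EE_AULEN),
                   ("E + LLA", KA_E_LLA)]:
    print(label, canonically_equivalent(KA_AU, seq))

# The two look-alikes carry different names and general categories.
for ch in ("\u0BB3", "\u0BD7"):
    print(code_points(ch), unicodedata.name(ch), unicodedata.category(ch))
```

Only the U+0BC6 sequence should compare as equivalent. The LLA sequence may render the same way, but normalization treats it as different text.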
> Isn't that a bit odd?
Not as odd as it may seem. These two characters do look the same, and
in the days before computer processing of Tamil there may have been no
need to distinguish between them (similar to older typewriters where
lowercase L was used for digit 1). But modern processing needs tip the
balance in favor of separate encoding.
This is not unheard of in Unicode. In the Runic alphabet, U+16BD RUNIC
LETTER SHORT-TWIG-HAGALL H and U+16C2 RUNIC LETTER E have identical
glyphs, as well as identical properties. But H and E are clearly not
the same letter, and were not used in the same Runic tradition, so they
are not unified.
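The Runic case can be inspected the same way. This is only a sketch: core_properties is a hypothetical helper, and it looks at just the handful of properties that Python's unicodedata module exposes, not the full Unicode Character Database.

```python
import unicodedata

def core_properties(ch: str) -> dict:
    # Hypothetical helper: collect a few per-character properties.
    return {
        "category": unicodedata.category(ch),
        "bidirectional": unicodedata.bidirectional(ch),
        "combining": unicodedata.combining(ch),
        "decomposition": unicodedata.decomposition(ch),
        "mirrored": unicodedata.mirrored(ch),
    }

RUNIC_H = "\u16BD"  # RUNIC LETTER SHORT-TWIG-HAGALL H
RUNIC_E = "\u16C2"  # RUNIC LETTER E

# Print name and properties for each of the two letters.
for ch in (RUNIC_H, RUNIC_E):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), core_properties(ch))

# Same properties, yet distinct characters with distinct names.
print(core_properties(RUNIC_H) == core_properties(RUNIC_E))
print(RUNIC_H == RUNIC_E)
```

With the properties inspected here, nothing but the code point and the name tells the two letters apart, which is the situation described above.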
> Giving an analogy using Latin script,
> that would be the same as if Latin y U+0079
> in vocalic and consonantic use were
> mapped to two different Unicode
Not really. First, "they" are never considered to be two separate
letters that happen to look the same, unlike the Tamil and Runic
examples. In English and Spanish at least, "y" is well understood to
have both a vocalic and consonantal role, but it is still a single
letter that happens to wear two different hats. Second, disunifying "y"
would cause untold mapping nightmares. And third, I don't know about
you, but the line between vocalic "y" and consonantal "y" isn't clear
enough for me to know when to use one character and when the other.