Re: Mixed up priorities

From: John Cowan (cowan@locke.ccil.org)
Date: Fri Oct 22 1999 - 10:59:35 EDT


G. Adam Stanislav scripsit:

> The funny thing is that had whoever ported the Roman alphabet to the Slovak
> language decided that particular sound should be written as slashed H, or
> whatever, no one would hesitate to encode it in Unicode.

Not necessarily. There are about 14 letterforms used in Yoruba that
don't have single-codepoint Unicode representations, not because Central
European languages are inherently more deserving than Yoruba, but because
there were no legacy encodings of Yoruba text that needed (or thought
they needed) 1:1 transliteration. If this hypothetical slashed-h
had been habitually encoded as "h backspace slash" or something of
the sort, it might have wound up in Unicode as LATIN LETTER H
followed by COMBINING SLASH.

Naturally, Yoruba text *can* be encoded in Unicode correctly, just not
on a one-letter-one-codepoint basis.
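
As a rough illustration of that point (a sketch only, not part of the
original message): the Python snippet below takes the Yoruba letter
e with dot below and acute accent and normalizes it to NFC. The choice of
this particular letter, and of U+0338 COMBINING LONG SOLIDUS OVERLAY as a
stand-in for the hypothetical "combining slash", are assumptions made for
the example.

```python
import unicodedata

# Yoruba e with dot below and acute, written as base letter + two combining marks.
# (Assumed example letter; the message itself names no specific letterform.)
decomposed = "e\u0323\u0301"

# NFC composes as far as precomposed characters exist in Unicode.
composed = unicodedata.normalize("NFC", decomposed)

# Inspect what is left after composition.
codepoints = [f"U+{ord(ch):04X}" for ch in composed]
names = [unicodedata.name(ch) for ch in composed]

for cp, name in zip(codepoints, names):
    print(cp, name)

# Hypothetical slashed h: base letter plus U+0338 (assumed stand-in for
# the "combining slash" mentioned above). NFC has nothing to compose it into.
slashed_h = unicodedata.normalize("NFC", "h\u0338")
print(len(slashed_h))
```

The letter composes only partway: the dot below folds into a precomposed
character, but the acute accent stays as a separate combining mark, so the
text is valid Unicode without being one letter per codepoint.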

-- 
John Cowan                                   cowan@ccil.org
       I am a member of a civilization. --David Brin


