G. Adam Stanislav scripsit:
> The funny thing is that had whoever ported the Roman alphabet to the Slovak
> language decided that particular sound should be written as slashed H, or
> whatever, no one would hesitate to encode it in Unicode.

Not necessarily. There are about 14 letterforms used in Yoruba that
don't have single-codepoint Unicode representations, not because Central
European languages are inherently more deserving than Yoruba, but because
there were no legacy encodings of Yoruba text that needed (or thought
they needed) 1:1 transliteration. If this hypothetical slashed-h
had been habitually encoded as "h backspace slash" or something of
the sort, it might have wound up in Unicode as LATIN LETTER H
followed by COMBINING SLASH.

Naturally, Yoruba text *can* be encoded in Unicode correctly, just not
on a one-letter-one-codepoint basis.
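
A minimal sketch of the difference, assuming Python's standard
unicodedata module. The sample letters are my own picks for illustration:
c with caron stands in for a Central European letter, and e with dot
below plus acute for a tone-marked Yoruba letter. The slashed h is
spelled here with U+0338 COMBINING LONG SOLIDUS OVERLAY, which is only
one plausible reading of the "combining slash" above. The helper name
nfc_names is hypothetical.

```python
import unicodedata


def nfc_names(text):
    """Return the Unicode character names of the NFC-normalized form of text."""
    composed = unicodedata.normalize("NFC", text)
    return [unicodedata.name(ch) for ch in composed]


# c + COMBINING CARON: a precomposed code point exists, so NFC
# collapses the sequence into a single code point.
c_caron = "c\u030C"

# e + COMBINING DOT BELOW + COMBINING ACUTE ACCENT: only the
# e-with-dot-below part has a precomposed form, so the acute accent
# stays behind as a separate combining mark.
e_dot_acute = "e\u0323\u0301"

# Hypothetical slashed h, written as h + COMBINING LONG SOLIDUS OVERLAY:
# no precomposed form exists, so NFC leaves both code points in place.
slashed_h = "h\u0338"

for label, text in [
    ("c with caron", c_caron),
    ("e with dot below and acute", e_dot_acute),
    ("slashed h", slashed_h),
]:
    names = nfc_names(text)
    print(f"{label}: {len(names)} code point(s): {names}")
```

All three sequences are valid Unicode text; only the first has a
one-letter-one-codepoint form after normalization.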

-- 
John Cowan  email@example.com
I am a member of a civilization.  --David Brin