G. Adam Stanislav <email@example.com> wrote:
> It does not matter to me whether Unicode chooses to respect
> linguistics of some languages but not of others.
This has already been covered by many list members. Unicode does not
favor certain languages over others. Unicode encodes characters, and
the characters C and H together form a single linguistic unit in Slovak
that is encoded with U+0043 and U+0048 (or lower-case equivalents).
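To make the encoding concrete, here is a small sketch (not part of any standard tooling) that lists the code points behind a string using Python's standard unicodedata module. The helper name describe() is invented for this illustration.

```python
import unicodedata


def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in text]


# The Slovak letter CH, upper case and lower case, is stored as two
# ordinary Latin letters in each case.
for code_point, name in describe("CH") + describe("ch"):
    print(code_point, name)
```

Under this view a word such as "chata" is simply five code points, the first two of which happen to form one letter of the Slovak alphabet.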
If this remark refers yet again to the encoding of digraphs for DZ with
hacek, LJ, and NJ, this too has already been answered. Someone once
thought it was important to provide one-to-one correspondence between
Latin and Cyrillic for the language formerly known as Serbo-Croatian.
If Slovak were commonly written in both Latin and Cyrillic, you probably
*would* have seen a Latin CH digraph to match the Cyrillic U+0427.
And if "whoever ported the Roman alphabet to the Slovak language
decided that particular sound should be written as slashed H, or
whatever," I believe it *would* have been included in Unicode, for the
simple reason that it would have been included in ISO 8859-2 and other
8-bit character sets, and could not have been represented any other way.
But it wasn't.
Have the following questions been answered yet?
1. Are the letters C and H used individually in Slovak for some
purpose such that a linguistic ambiguity exists between "C as C
followed by H as H" and "C followed by H, as CH"?
2. What text processes in Slovak cannot be performed unless there
is a separate Unicode code point assigned to the letter CH?
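Sorting is one text process that often comes up in connection with question 2. As a rough sketch only, the following shows how a sort key could treat the two-character sequence as one unit that orders after H, with no separate code point involved. It is deliberately simplified: the alphabet table covers only unaccented letters plus "ch", the names ALPHABET, WEIGHT, and slovak_key are invented for this example, and it assumes every "c" followed by "h" is the letter CH, which is exactly the ambiguity question 1 asks about.

```python
# Simplified alphabet: unaccented letters with "ch" placed after "h".
# Real Slovak collation also orders accented letters and other digraphs.
ALPHABET = [
    "a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k", "l", "m",
    "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
WEIGHT = {letter: index for index, letter in enumerate(ALPHABET)}


def slovak_key(word):
    """Build a list of weights, consuming "ch" as a single collation unit."""
    lowered = word.lower()
    key = []
    i = 0
    while i < len(lowered):
        if lowered.startswith("ch", i):
            key.append(WEIGHT["ch"])
            i += 2
        else:
            # Characters outside the simplified table sort after it.
            key.append(WEIGHT.get(lowered[i], len(ALPHABET)))
            i += 1
    return key


words = ["cesta", "chata", "hora", "ihla", "dom"]
print(sorted(words))                  # plain code point order
print(sorted(words, key=slovak_key))  # "chata" moves to after "hora"
```

With the key applied, "chata" sorts between "hora" and "ihla" instead of between "cesta" and "dom".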
Finally, I would venture that most of the people on the Unicode mailing
list are here because they have an interest in internationalizing
software, and know a little about the world they are trying to
internationalize software for. A comparison between these people and
those who think Slovakia is in Yugoslavia, South America, or Siberia
is unfair, to say the least.
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:54 EDT