Re: Mixed up priorities

From: Ashley Yakeley (ashley@semantic.org)
Date: Thu Oct 21 1999 - 20:19:02 EDT


At 1999-10-21 16:49, G. Adam Stanislav wrote:

>At 13:06 21-10-1999 -0700, Michael Everson wrote:
>>But you are wrong. CH is not a _character_ in any language. It is a set of
>>strings of characters (C-H, C-h, c-h) used (sorted etc.) as a _letter_ in
>>languages like Slovak, Czech, Welsh, and traditional Spanish.
>
>Respectfully, I disagree. I cannot speak for Welsh and Spanish, but in
>Slovak and Czech, CH has all characteristics of a character: It denotes a
>specific sound

...just as it does in English, though doubtless a different sound....

>which cannot be expressed in any other way.

...just as it cannot in English...

>Nor can it be separated into two sounds.

The English 'ch' can be separated into 't-sh', though 'sh' and 'th'
cannot be.

>Many other alphabets have a separate character for this sound, e.g. the chi
>in Greek, or the Cyrillic character that looks like the Roman X.

And many do not, e.g. the 'ch' in the Scottish 'loch' or German 'Bach'.

...
>It is not simply a string of characters because it cannot be separated. You
>cannot, for example, divide a word at the end of a line by following the C
>with a - and starting the next line with an H.

Neither can you in English.

>It is *not* C-H, C-h, and
>c-h. It is CH, Ch, and ch.

Just as it is in English.

>Also, ask any Slovak to tell you what the alphabet is, he will inevitably
>list a H CH I within the sequence.

That's a sorting thing, isn't it? Unicode doesn't do sorting for you; you
have to define sorting algorithms on top of it.
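
To make the point concrete, here is a minimal sketch of a sort order layered
on top of plain character strings, where "ch" is treated as one letter
between "h" and "i". Everything in it is an assumption for illustration: the
names ALPHABET, WEIGHT, TOKEN and collation_key are hypothetical, the
alphabet is simplified (no diacritics), and the sample words are arbitrary.
A real system would more likely use a locale-tailored collation library than
a hand-rolled table like this.

```python
import re

# Simplified Slovak/Czech-style alphabet (assumption: diacritics omitted).
# "ch" is a single collation element placed between "h" and "i".
ALPHABET = ["a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
            "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w",
            "x", "y", "z"]
WEIGHT = {letter: index for index, letter in enumerate(ALPHABET)}

# Try "ch" before any single character so the digraph is kept whole.
TOKEN = re.compile(r"ch|.", re.DOTALL)


def collation_key(word):
    """Map a word to a list of weights, treating 'ch' as one letter."""
    tokens = TOKEN.findall(word.lower())
    # Characters outside the table sort after the alphabet (arbitrary choice).
    return [WEIGHT.get(token, len(ALPHABET)) for token in tokens]


words = ["chata", "cena", "hora", "igla", "dom", "Chlieb"]

# Plain code point order: "chata" lands among the c-words.
print(sorted(words))

# Tailored order: "chata" and "Chlieb" land after "hora" and before "igla".
print(sorted(words, key=collation_key))
```

The stored text is still just the characters c and h in both cases; only the
comparison key changes, which is why the alphabet order H, CH, I does not by
itself require CH to be encoded as a single character.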

-- 
Ashley Yakeley, Seattle WA


