Re: Mixed up priorities

From: Kevin Bracey (kevin.bracey@pacemicro.com)
Date: Fri Oct 22 1999 - 05:27:04 EDT


In message <199910220056.RAA22990@unicode.org>
          "G. Adam Stanislav" <adam@whizkidtech.net> wrote:

> At 14:24 21-10-1999 -0700, peter_constable@sil.org wrote:
> >
> >
> > Adam:
> >
> > The entity "ch" already has an encoding in Unicode: U+0063 +
> > U+0068. It doesn't need another precomposed-form encoding,
> > unless there is something else you're not telling us: please
> > specify one text process that you or other Czechs need to have
> > performed that can't be done unless there is a separate,
> > precomposed character for "ch".
>
> Argh, Peter, I am no more Czech than Michael is English! I am Slovak.
>
> Yes, we can type "ch" using the GLYPHS "c" and "h", but Unicode prides
> itself on being a character encoding, not a glyph encoding. To us, "ch" is
> a character. Period. In our dictionaries the "ch" follows the "h" and
> precedes the "i". We would never dream of looking for "ch" after "cg" and
> before "ci".
>

Different languages sort the same characters in different orders; this is
not news. What I would like to know is whether Czech or Slovak distinguishes
between your "character" "ch" and the two letters "c" and "h" that merely
happen to sit next to each other. Or would you always interpret a c followed
by an h as your "ch" character? If so, you already have a totally
unambiguous encoding: "ch" = U+0063 + U+0068.
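
To illustrate the point about sort order, here is a minimal sketch of a
language-specific sort that treats the existing sequence U+0063 + U+0068 as
one collation element placed between "h" and "i". The alphabet table, the
function name collation_key and the sample words are assumptions made for
illustration; diacritics are left out, and a real implementation would use
proper locale collation data rather than this hand-written table.

```python
# Simplified alphabet (an assumption for this sketch): "ch" is one collation
# element that sorts after "h" and before "i". Diacritics are omitted.
ALPHABET = [
    "a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k", "l", "m",
    "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {element: index for index, element in enumerate(ALPHABET)}


def collation_key(word):
    """Return a list of ranks, reading every c + h pair as the single element "ch"."""
    lower = word.lower()
    key = []
    i = 0
    while i < len(lower):
        if lower.startswith("ch", i):
            key.append(RANK["ch"])
            i += 2
        else:
            # Characters outside the table sort after everything else.
            key.append(RANK.get(lower[i], len(ALPHABET)))
            i += 1
    return key


words = ["cit", "chata", "hora", "cena", "ihla"]
code_point_order = sorted(words)                     # plain code point order
dictionary_order = sorted(words, key=collation_key)  # "ch" after "h"
print(code_point_order)
print(dictionary_order)
```

The stored text is the same in both cases; only the comparison rule changes,
so the dictionary order does not depend on a separate code point.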

If you suddenly introduce two ways of spelling "ch", you introduce a whole
new realm of problems in searching and character set conversion. And how
are the users going to enter this composite "ch" character, anyway?
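
A small sketch of the searching problem, under loud assumptions: Unicode has
no precomposed "ch", so a Private Use Area code point (U+E000) stands in for
the proposed character, and fold_ch is a made-up helper name. It shows only
that a plain substring search stops matching once two spellings exist, unless
every search first folds one spelling into the other.

```python
# HYPOTHETICAL: no precomposed "ch" exists in Unicode. A Private Use Area
# code point stands in for the proposed character, for illustration only.
PRECOMPOSED_CH = "\uE000"


def fold_ch(text):
    """Map the hypothetical precomposed form back to the existing c + h sequence."""
    return text.replace(PRECOMPOSED_CH, "ch")


document = PRECOMPOSED_CH + "ata pri jazere"  # text entered with the hypothetical character
query = "chata"                               # query typed as U+0063 U+0068

naive_hit = query in document                     # the two spellings do not match
folded_hit = fold_ch(query) in fold_ch(document)  # matches only after folding
print(naive_hit, folded_hit)
```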

-- 
Kevin Bracey, Senior Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.acorn.co.uk/


