Re: Mixed up priorities

From: Kevin Bracey
Date: Fri Oct 22 1999 - 05:27:04 EDT

"G. Adam Stanislav" wrote:

> At 14:24 21-10-1999 -0700, wrote:
> >
> >
> > Adam:
> >
> > The entity "ch" already has an encoding in Unicode: U+0063 +
> > U+0068. It doesn't need another precomposed-form encoding,
> > unless there is something else you're not telling us: please
> > specify one text process that you or other Czechs need to have
> > performed that can't be done unless there is a separate,
> > precomposed character for "ch".
>
> Argh, Peter, I am no more Czech than Michael is English! I am Slovak.
> Yes, we can type "ch" using the GLYPHS "c" and "h", but Unicode prides
> itself in being a character encoding, not a glyph encoding. To us, "ch" is
> a character. Period. In our dictionaries the "ch" follows the "h" and
> precedes the "i". We would never dream of looking for "ch" after "cg" and
> before "ci".

Different languages have different sort orders for the same characters;
this is not news. One thing I would like to know is, in Slovak, is there
a distinction between your "character" "ch" and the two letters "ch"
next to each other? Or would you always interpret a c and an h next
to each other as your "ch" character? If so, you already have a totally
unambiguous encoding: "ch" = U+0063 + U+0068.
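
To illustrate the point about sort order, here is a minimal Python sketch
of a collation key that treats the sequence U+0063 + U+0068 as a single
sorting unit placed after "h", with no precomposed character involved. The
alphabet table, the function name collation_key and the sample words are
my own assumptions for illustration; a real Slovak or Czech collation
would also have to handle diacritics and case properly.

```python
# Simplified alphabet (an assumption for illustration): the plain Latin
# letters, with the digraph "ch" inserted between "h" and "i".
ALPHABET = ["a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
            "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w",
            "x", "y", "z"]
RANK = {unit: position for position, unit in enumerate(ALPHABET)}


def collation_key(word):
    """Return a list of ranks, treating 'c' followed by 'h' as one unit."""
    lower = word.lower()
    key = []
    i = 0
    while i < len(lower):
        if lower.startswith("ch", i):
            # Two code points (U+0063, U+0068), one collation unit.
            key.append(RANK["ch"])
            i += 2
        else:
            # Characters outside the table sort after the whole alphabet.
            key.append(RANK.get(lower[i], len(ALPHABET)))
            i += 1
    return key


words = ["cit", "chata", "hora", "cesta", "ihla"]

# Plain code point order puts "chata" between "cesta" and "cit".
print(sorted(words))
# The tailored key puts "chata" after "hora" and before "ihla".
print(sorted(words, key=collation_key))
```

The text itself is stored as ordinary "c" and "h" throughout; only the
comparison step knows about the digraph.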

If you suddenly introduce two ways of spelling "ch", you introduce a whole
new realm of problems in searching and character set conversion. And how
are the users going to enter this composite "ch" character, anyway?

Kevin Bracey, Senior Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW:
