Re: A basic question on encoding Latin characters

From: Scott Horne (shorne@metaphasetech.com)
Date: Mon Sep 27 1999 - 15:12:54 EDT


Michael Everson wrote:
>
> In fact it
> is _easier_ to support languages (for things like matching and searching)
> if you don't have to _also_ normalize between a precomposed and a
> decomposed form.

I agree. That's why it should've been all or none. If we can't have
precomposed _ç_-overdot, we shouldn't have precomposed _ç_. If that
means that there's no unambiguous round-trip conversion between Unicode
and some hypothetical encoding with both _ç_ and combining cedilla, so be it.
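
To make the two spellings concrete, here is a minimal sketch in Python, assuming the standard unicodedata module. The helper name same_text is made up for illustration and is not part of any library. It shows that precomposed ç and c + combining cedilla are different code point sequences that compare equal only after normalization, while ç-overdot has no single precomposed code point.

```python
import unicodedata

precomposed = "\u00e7"              # LATIN SMALL LETTER C WITH CEDILLA
decomposed = "c\u0327"              # c + COMBINING CEDILLA
c_cedilla_overdot = "\u00e7\u0307"  # c-cedilla + COMBINING DOT ABOVE


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw code point sequences differ...
print(precomposed == decomposed)
# ...but they are canonically equivalent once normalized.
print(same_text(precomposed, decomposed))
# NFC cannot fold c-cedilla-overdot into one code point; a combining mark remains.
print(len(unicodedata.normalize("NFC", c_cedilla_overdot)))
```

Any matching or searching code that accepts both spellings has to run a step like same_text before comparing.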

> Meaning that there were _technical_ reasons for drawing a line at the
> normalization border. The line was not drawn for political or socioeconomic
> reasons as you state.

Have you forgotten the huge battle that was waged on this list
(successfully, I'm glad to say) eight or nine years ago to get
a few dozen diacritically marked Vietnamese letters added to
the UCS?

Scott Horne


