RE: A basic question on encoding Latin characters

From: Murray Sargent (murrays@microsoft.com)
Date: Mon Sep 27 1999 - 16:48:39 EDT


It is interesting to note that the Vietnamese experts at Microsoft chose
not to use the special precomposed Vietnamese characters that were added to
Unicode, because these characters aren't readily compatible with existing
Vietnamese documents and databases. Adding new precomposed characters often
presents as many problems as it purports to solve.

Murray

> -----Original Message-----
> From: Scott Horne [SMTP:shorne@metaphasetech.com]
> Sent: Monday, September 27, 1999 12:18 PM
> To: Unicode List
> Cc: Unicode List
> Subject: Re: A basic question on encoding Latin characters
>
> Michael Everson wrote:
> >
> > In fact it
> > is _easier_ to support languages (for things like matching and
> > searching) if you don't have to _also_ normalize between a
> > precomposed and a decomposed form.
>
> I agree. That's why it should've been all or none. If we can't have
> precomposed _ç_-overdot, we shouldn't have precomposed _ç_. If that
> means that there's no unambiguous round-trip conversion between Unicode
> and some hypothetical encoding with both _ç_ and combining cedilla, so be
> it.
>
> > Meaning that there were _technical_ reasons for drawing a line at the
> > normalization border. The line was not drawn for political or
> > socioeconomic reasons as you state.
>
> Have you forgotten the huge battle that was waged on this list
> (successfully, I'm glad to say) eight or nine years ago to get
> a few dozen diacritically marked Vietnamese letters added to
> the UCS?
>
> Scott Horne
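
The matching problem discussed above (precomposed versus decomposed
spellings of the same letter) can be illustrated with a small sketch. It
is not taken from any of the messages: it assumes Python's standard
unicodedata module and the Unicode normalization forms NFC and NFD, and
the helper name same_text is hypothetical.

```python
import unicodedata

def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after bringing both to one normalization form.

    Hypothetical helper for illustration; 'form' is passed straight to
    unicodedata.normalize.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# c with cedilla: one precomposed code point vs. base letter + combining mark.
precomposed_c = "\u00e7"      # LATIN SMALL LETTER C WITH CEDILLA
decomposed_c = "c\u0327"      # 'c' + COMBINING CEDILLA

# A naive code-point comparison treats the two spellings as different.
naive_match = precomposed_c == decomposed_c

# After normalization they compare equal.
normalized_match = same_text(precomposed_c, decomposed_c)

# c with cedilla and dot above has no single precomposed code point, so
# even the composed form (NFC) keeps a separate combining dot above.
c_cedilla_dot = unicodedata.normalize("NFC", "c\u0327\u0307")

# A Vietnamese letter that does have a precomposed form:
# e + COMBINING CIRCUMFLEX ACCENT + COMBINING ACUTE ACCENT composes to U+1EBF.
viet_composed = unicodedata.normalize("NFC", "e\u0302\u0301")
viet_decomposed = unicodedata.normalize("NFD", "\u1ebf")

print(naive_match, normalized_match)
print([hex(ord(ch)) for ch in c_cedilla_dot])
print([hex(ord(ch)) for ch in viet_composed])
print([hex(ord(ch)) for ch in viet_decomposed])
```

Any software that stores or searches such text has to pick one form and
apply it consistently, which is the cost both sides of the thread are
weighing.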


