Re: Multiple encodings for 1 character

From: David Possin (dave_i18n@yahoo.com)
Date: Mon Jul 08 2002 - 17:22:47 EDT


You will have to normalize the strings before they are processed, and you
need to make sure it is done the same way every time. Check out ICU for
this purpose.

http://oss.software.ibm.com/icu/
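
As a rough illustration of what normalizing before comparing means, here
is a minimal sketch. It does not use ICU; it assumes Python's standard
unicodedata module, and the helper name same_text is hypothetical, chosen
only for this example. It shows the same accented letter encoded two ways
and compared after both strings are brought to one normalization form.

```python
import unicodedata

# Two encodings of the same visible character, "e with acute accent":
precomposed = "\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a, b, form="NFC"):
    """Compare two strings after normalizing both to the same form.

    Hypothetical helper for illustration; 'form' may be any form accepted
    by unicodedata.normalize ("NFC", "NFD", "NFKC", "NFKD").
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A raw comparison looks at code points, so the two strings differ.
raw_equal = precomposed == decomposed

# After normalization to a common form, they compare equal.
normalized_equal = same_text(precomposed, decomposed)
```

The important part is that both sides go through the same form every
time; whether that form is NFC or NFD matters less than applying it
consistently wherever strings are stored or compared.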

Dave
--- "Theodore H. Smith" <delete@softhome.net> wrote:
> What is going to be done about the confusion generated from
> having multiple ways to encode the same character?
>
> For example, for filenames, OSX will encode an accented Roman
> letter one way, while for filenames Windows will encode it the
> other way. These kinds of confusion are entirely expected if
> Unicode allows more than one way to encode the same
> character.
>
> This means that matching algorithms won't work, because the
> characters are different!
>
> Will there be some kind of recommendation of which to avoid?
> Will the Unicode consortium make a standard to say that one of
> these encodings is strongly not recommended, and in fact
> deprecated?
>
> And what about the OS that uses this encoding? How will the
> Unicode consortium make the newly-offending OS change its ways?
>
> And what about the hordes of apps that expect one format but
> don't expect the other? And the hordes of OS-independent apps
> (Java? Perl?) that might generate conflicting versions?
>
>

=====
Dave Possin
Globalization Consultant
www.Welocalize.com
http://groups.yahoo.com/group/locales/




This archive was generated by hypermail 2.1.2 : Mon Jul 08 2002 - 15:27:05 EDT