Re: Multiple encodings for 1 character

From: Theodore H. Smith (delete@softhome.net)
Date: Mon Jul 08 2002 - 17:32:59 EDT


>> For example, for filenames, OSX will encode an accented Roman
>> letter one way, while for filenames Windows will encode it the
>> other way. This kind of confusion is to be expected if
>> Unicode allows more than one way to encode the same
>> character.
>
> Perhaps a stray newsfeed routed via Alpha Centauri?
> This is *very* old news, indeed.

I'm new to this, though.

>> This means that matching algorithms won't work, because the
>> characters are different!
>>
>> Will there be some kind of recommendation of which to avoid?
>> Will the Unicode consortium make a standard to say that one of
>> these encodings is strongly not recommended, and in fact
>> depreciated?
>
> UAX #15: Unicode Normalization Forms
>
> http://www.unicode.org/unicode/reports/tr15/

Thanks.
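
For anyone else new to this, here is a minimal sketch of the problem and of what UAX #15 normalization does about it. It assumes Python and its standard unicodedata module; the helper name same_text and the choice of NFC as the default form are my own illustrative assumptions, not anything the standard mandates.

```python
import unicodedata

# Two encodings of the same accented letter:
# U+00E9 is the precomposed "e with acute";
# "e" followed by U+0301 (combining acute accent) is the decomposed form.
precomposed = "\u00e9"
decomposed = "e\u0301"


def same_text(a, b, form="NFC"):
    """Compare two strings after normalizing both to the same form.

    `form` is any of the four UAX #15 forms: NFC, NFD, NFKC, NFKD.
    (Hypothetical helper, named here only for illustration.)
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A naive code-point comparison treats the two spellings as different strings.
raw_match = precomposed == decomposed

# Normalizing both sides to one form first makes them compare equal.
normalized_match = same_text(precomposed, decomposed)

# NFD splits the precomposed letter into base letter plus combining mark.
nfd_length = len(unicodedata.normalize("NFD", precomposed))
```

Either form works for matching, as long as both strings are brought to the same one before they are compared.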

> And it is up to an implementation to specify which normalization
> form it uses.
>
> By the way, we don't depreciate Unicode encodings -- we appreciate
> them. ;-)

That's a shame. Simplicity is wonderful.

--
     Theodore H. Smith - Macintosh Consultant / Contractor.
     My website: <www.elfdata.com/>


