> What is going to be done about the confusion generated from
> having multiple ways to encode the same character?
> For example, for filenames, OSX will encode an accented Roman
> letter one way, while for filenames Windows will encode it the
> other way. This kind of confusion is to be expected if
> Unicode allows more than one way to encode the same character.
Perhaps a stray newsfeed routed via Alpha Centauri?
This is *very* old news, indeed.
> This means that matching algorithms won't work, because the
> characters are different!
> Will there be some kind of recommendation of which to avoid?
> Will the Unicode consortium make a standard to say that one of
> these encodings is strongly not recommended, and in fact
UAX #15: Unicode Normalization Forms
And it is up to an implementation to specify which normalization
form it uses.
By the way, we don't depreciate Unicode encodings -- we appreciate them.
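
As a rough illustration of what the normalization forms in UAX #15 buy you, here is a minimal Python sketch using the standard library's unicodedata module. The sample strings and the helper name same_text are made up for this example; the idea is only that the precomposed and decomposed spellings of an accented letter compare unequal as raw code point sequences, but compare equal once both sides are brought to the same normalization form.

```python
import unicodedata

# Two spellings of the same accented letter (illustrative sample data).
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a, b, form="NFC"):
    """Compare two strings after normalizing both to the same form.

    same_text is a hypothetical helper name, not part of any standard API.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The raw code point sequences differ, so a naive comparison fails.
print(precomposed == decomposed)

# After normalizing both sides to one form, the comparison succeeds.
print(same_text(precomposed, decomposed))
print(same_text(precomposed, decomposed, form="NFD"))
```

Which form gets used matters less than applying the same one to both sides of the comparison, which is why the choice can be left to the implementation.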
> And what about the OS that uses this encoding? How will the
> Unicode consortium make the newly-offending OS change its ways?
It isn't offending, and the Unicode Consortium won't.