Multiple encodings for 1 character

From: Theodore H. Smith (delete@softhome.net)
Date: Mon Jul 08 2002 - 15:29:14 EDT


What is going to be done about the confusion generated from
having multiple ways to encode the same character?

For example, for filenames, OSX will encode an accented Roman
letter one way, while for filenames Windows will encode it the
other way. This kind of confusion is to be expected if
Unicode allows more than one way to encode the same
character.

This means that matching algorithms won't work, because the
characters are different!
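
To make the mismatch concrete, here is a minimal sketch in Python. It
assumes the two encodings in question are the precomposed and the
decomposed spelling of an accented letter; "same_text" is a hypothetical
helper name, and normalizing both sides before comparing is only one
possible way to get a match, not something any particular OS is known
to do.

```python
import unicodedata

# Two encodings of the same visible character, "e" with an acute accent.
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# A plain code point comparison treats the two strings as different.
naive_match = precomposed == decomposed


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# After normalization the two spellings compare as equal.
normalized_match = same_text(precomposed, decomposed)
```

Any filename lookup that behaves like the plain comparison would miss
a file whose name was written by a system that uses the other spelling.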

Will there be some kind of recommendation of which to avoid?
Will the Unicode Consortium make a standard to say that one of
these encodings is strongly not recommended, and in fact
deprecated?

And what about the OS that uses this encoding? How will the
Unicode Consortium make the newly-offending OS change its ways?

And what about the hordes of apps that expect one format but
don't expect the other? And the hordes of OS-independent apps
(Java? Perl?) that might generate conflicting versions?


