Re: Last Call: UTF-16, an encoding of ISO 10646 to Informational

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Mon Aug 16 1999 - 11:35:55 EDT


> > fwiw I fully agree with Markus; I can't see any sense in defining
> > two ways, plus a third ambiguous way, of doing the same thing,
> > especially when it's not something we want people to do anyway.
>
> Agreed. How about:
>
> UTF-16
> UTF-16-swapped
>
> Two ways, but one is obviously preferred over the other.
>
Allowing two ways and preferring one is the same as allowing two
ways with no preference at all. As soon as you allow (say) UTF-16LE
on the Internet for interchange, then every application will have
to become "UTF-16LE Compliant".

Why does it seem desirable to have multiple encodings for UTF-16?
It might seem on the surface that it is "nice" to allow everything
on the Internet, because this makes it easier to create applications
that put data on the wire. But that's backwards. It's not nice,
because it makes it extremely difficult to write Internet applications
when they have to allow for many possibilities when just one would
suffice.

Granted it's not a big deal to swap bytes, but the same principle
should apply in general -- we don't need multiple encodings for the
same thing. It hampers interoperability and it makes it unreasonably
difficult to produce applications that accept data from the Internet.
There is nothing to be gained by promoting complexity when simplicity
will do the same job.
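
To make the cost on the receiving side concrete, here is a minimal
sketch in Python. It is an illustration only: the function names, the
label table, and the rule of falling back to big-endian when there is
neither a label nor a byte order mark are all assumptions made for
this example, not anything specified in the draft under discussion.

```python
import codecs
from typing import Optional

# Hypothetical label table: charset labels a receiver might be handed.
_LABELS = {"utf-16le": "utf-16-le", "utf-16be": "utf-16-be"}


def decode_utf16_any(data: bytes, label: Optional[str] = None) -> str:
    """Receiver that must accept either byte order."""
    if label is not None and label.lower() in _LABELS:
        # Case 1: the sender labelled the byte order explicitly.
        return data.decode(_LABELS[label.lower()])
    if data.startswith(codecs.BOM_UTF16_LE):
        # Case 2: little-endian byte order mark; strip it and decode.
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        # Case 3: big-endian byte order mark; strip it and decode.
        return data[2:].decode("utf-16-be")
    # Case 4: no label, no mark. Assumed here: treat as big-endian.
    return data.decode("utf-16-be")


def decode_utf16_be_only(data: bytes) -> str:
    """Receiver when exactly one byte order is allowed on the wire."""
    return data.decode("utf-16-be")
```

The byte swap itself is one line in each branch. The extra cases, and
the guess needed in the last one, are what every receiver has to carry
once more than one form is allowed.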

- Frank


