Re: Last Call: UTF-16

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Tue Aug 17 1999 - 19:05:18 EDT


> But ultimately the solution is to make use of the universal character
> set wherever possible -- and to keep resisting the addition of more
> 8-bit character encodings that add to the legacy problem and
> that add to the registry messes.
>
Amen.

> Your example of "doesn't handle BIDI" comes down to a question of
> *if* your implementation interprets characters in the main Hebrew,
> Arabic, Syriac, or Thaana blocks of the standard, and *if* it does
> any display at all (as opposed to backend processing with no
> display component), then it *must* conform to Unicode bidirectional
> behavior, since that is part of the specified normative behavior
> of characters from those blocks.
>
This is a topic of a separate thread -- should a terminal emulator
that handles a Unicode data stream implement the BIDI algorithm? In cases
where the companion host application needs precise control of the terminal
screen, perhaps it should not.

> But UTF-16BE, UTF-16LE, and UTF-16 are already in use in
> various vendor and other protocols, and it would be nice if we could
> get the naming problem out of the way, ditch "UCS-2" for good, and agree
> on our labels.
>
Why is it bad to say "UCS-2"? Somebody said this before and I didn't
understand.

Isn't the difference between UCS-2 and UTF-16 that the latter specifies
a way to access the nonzero planes in 16 bits, whereas the former does
not? So if an application only claims to handle the BMP, isn't it
dealing with UCS-2?
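
To make the question concrete, here is a minimal Python sketch of the
distinction as I understand it. The helper names to_surrogates and
is_ucs2_representable are hypothetical, chosen only for illustration, and
the sample characters are arbitrary. It assumes the usual UTF-16 rule:
subtract 0x10000 from the code point, then put the top 10 bits in a high
surrogate (starting at 0xD800) and the bottom 10 bits in a low surrogate
(starting at 0xDC00).

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    v = cp - 0x10000               # 20-bit offset into planes 1..16
    high = 0xD800 + (v >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (v & 0x3FF)     # bottom 10 bits -> low surrogate
    return high, low


def is_ucs2_representable(text):
    """True if every character fits in a single 16-bit unit (BMP only)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


bmp = "\u05D0"       # a BMP character (Hebrew letter alef)
supp = "\U00010400"  # a plane 1 character, outside the BMP

# One 16-bit unit for the BMP character, two for the plane 1 character.
bmp_bytes = bmp.encode("utf-16-be")
supp_bytes = supp.encode("utf-16-be")

print(bmp_bytes.hex(), is_ucs2_representable(bmp))
print(supp_bytes.hex(), is_ucs2_representable(supp))
print([hex(u) for u in to_surrogates(0x10400)])
```

For text that stays inside the BMP, the bytes come out the same under
either label; the two only diverge once a character from a nonzero plane
shows up, which is the case the surrogate mechanism exists for.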

- Frank


