Re: Java and Unicode

From: Jungshik Shin (jshin@pantheon.yale.edu)
Date: Wed Nov 15 2000 - 12:26:10 EST


On Wed, 15 Nov 2000, Doug Ewell wrote:

> Elliotte Rusty Harold <elharo@metalab.unc.edu> wrote:
>
> > There are a number of possibilities that don't break backwards
> > compatibility (making trans-BMP characters require two chars rather
> > than one, defining a new wchar primitive data type that is 4-bytes
> > long as well as the old 2-byte char type, etc.) but they all make the
> > language a lot less clean and obvious. In fact, they all more or less

> This is one of the great difficulties in creating a "clean" design:
> making it flexible enough so that it remains clean even in the face of
> unexpected changes (like Unicode requiring more than 16 bits).
>
> But was it really unexpected? I wonder when the Java specification was
> written -- specifically, was it before or after Unicode and JTC1/SC2/WG2
> began talking openly about moving beyond 16 bits?

That's exactly what I have in mind about Java. I can't help wondering why
Sun chose a 2-byte char instead of a 4-byte char when it was plainly
obvious that 2 bytes wouldn't be enough in the very near future. The same
can be said of Mozilla, which, as far as I know, internally uses a
BMP-only representation. Was it due to concerns over things like saving
memory/storage, etc.?
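(As a sketch of the "two chars per character" workaround mentioned above:
with a 16-bit char, a trans-BMP code point has to be split into a
surrogate pair. The APIs below are from later Java releases and are shown
only for illustration; U+10400 is an arbitrary example code point.)

```java
public class SurrogateDemo {
    public static void main(String[] args) {
        int codePoint = 0x10400;              // outside the BMP
        char[] units = Character.toChars(codePoint);
        System.out.println(units.length);     // 2 -- a surrogate pair
        String s = new String(units);
        System.out.println(s.length());       // 2 chars in the string...
        System.out.println(s.codePointCount(0, s.length())); // ...but 1 code point
    }
}
```

So every API that counts or indexes chars silently stops matching the
user-visible character count, which is exactly the loss of cleanness
being discussed.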

Jungshik Shin



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:15 EDT