RE: Fwd: Wired 4.09 p. 130: Lost in Translation

From: Alain LaBonté
Date: Thu Aug 29 1996 - 08:36:55 EDT

At 19:03 28/08/1996 -0700, Keld Simonsen wrote:
>unicode@Unicode.ORG writes:
>> Pls note that UTF-8 is a 31-bit standard (not just a 24-bit standard),
>> so it offers a variable-length byte encoding of all 10646 characters let
>> alone all Unicode characters.
>Well, to be more exact, UTF-8 is a transformation format that
>can encode all of ISO/IEC 10646 characters, ISO/IEC 10646 can
>encode 2 Gb characters (2**31 characters). UTF-8 will transform
>a 10646 character to a sequence of between 1 and *6* octets.
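
[A rough illustration of the 1-to-6-octet transformation described in the
quoted text, written as a minimal Python sketch. It assumes the original
31-bit UTF-8 layout; the function names utf8_octet_count and utf8_encode_31bit
are made up for this example and do not come from any standard or library.]

```python
# Sketch of the original 31-bit UTF-8 scheme (1 to 6 octets per character).
# Function names are hypothetical, chosen only for this illustration.

# Largest code value that fits in 1, 2, 3, 4, 5 and 6 octets respectively.
LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]

# Marker bits of the lead octet for multi-octet sequences, keyed by length.
LEAD_MARKS = {2: 0xC0, 3: 0xE0, 4: 0xF0, 5: 0xF8, 6: 0xFC}


def utf8_octet_count(code: int) -> int:
    """Return how many octets the 31-bit scheme needs for this code value."""
    if code < 0 or code > 0x7FFFFFFF:
        raise ValueError("code value outside the 31-bit range")
    for length, limit in enumerate(LIMITS, start=1):
        if code <= limit:
            return length
    raise AssertionError("unreachable")


def utf8_encode_31bit(code: int) -> bytes:
    """Transform one code value into its sequence of 1 to 6 octets."""
    length = utf8_octet_count(code)
    if length == 1:
        return bytes([code])
    octets = []
    # Peel off 6 bits at a time, lowest bits first, for the trailing octets.
    for _ in range(length - 1):
        octets.append(0x80 | (code & 0x3F))
        code >>= 6
    # Whatever bits remain go into the lead octet under its length marker.
    octets.append(LEAD_MARKS[length] | code)
    return bytes(reversed(octets))
```

[For values that today's UTF-8 also covers, the result should agree with
Python's built-in codec, e.g. utf8_encode_31bit(0xE9) and
"é".encode("utf-8"); the 5- and 6-octet forms exist only in the 31-bit
scheme.]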

Btw when we made the French version of the UCS (JUC -- jeu universel de
caractères codés sur plusieurs octets) we had a hard time understanding the
real meaning of "transformation format"... after a long discussion we called
it "format transformé" in French. Mr. Paterson (the current editor of the
UCS) said he should and would perhaps change the English name to
"transformed format"...

However, the string for ASN.1 stays as is, even if it is odd, as it is in
this case part of the technical content now (we should not use English for
identifiers: if there is a mistake in the naming, it misleads people!)

Alain LaBonté

