RE: Fwd: Wired 4.09 p. 130: Lost in Translation

From: Glen.Seeds (
Date: Thu Aug 29 1996 - 11:08:57 EDT

I'm not sure that this is quite right. I always thought that UNIVERSAL
TRANSFORMATION FORMAT meant that it was a form that anything could be
transformed TO, passed through systems that didn't understand 10646 or
UNICODE, and then transformed FROM again, with no loss. "Transformed
format" doesn't quite capture this. You may want to talk to Gary Miller
of IBM about the original intent.
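
To make the round trip concrete, here is a minimal sketch. It assumes
Python's built-in UTF-8 codec as the transformation, and the function
legacy_upper_ascii is a hypothetical stand-in (not from any real system)
for software that knows only ASCII bytes:

```python
def legacy_upper_ascii(data: bytes) -> bytes:
    """Hypothetical stand-in for a byte-oriented system that knows only ASCII.

    It uppercases ASCII letters (0x61..0x7A) and passes every other
    byte through unchanged.
    """
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in data)


text = "jeu universel de caractères codés"

transformed = text.encode("utf-8")        # transform TO
passed = legacy_upper_ascii(transformed)  # pass through the ASCII-only system
restored = passed.decode("utf-8")         # transform FROM

# With no intermediate processing the round trip is the identity.
round_trip_ok = text.encode("utf-8").decode("utf-8") == text
```

The accented letters survive here because every octet of a multi-octet
sequence is 0x80 or above, so an ASCII-only system never mistakes one for
a letter.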

>From: Alain LaBonté
>Sent: August 29, 1996 8:36 AM
>To:; unicode@Unicode.ORG
>Subject: RE: Fwd: Wired 4.09 p. 130: Lost in Translation
>At 19:03 28/08/1996 -0700, Keld Simonsen wrote:
>>unicode@Unicode.ORG writes:
>>> Pls note that UTF-8 is a 31-bit standard (not just a 24-bit standard),
>>> so it offers a variable-length byte encoding of all 10646 characters let
>>> alone all Unicode characters.
>>Well, to be more exact, UTF-8 is a transformation format that
>>can encode all ISO/IEC 10646 characters; ISO/IEC 10646 can
>>encode 2 G characters (2**31 characters). UTF-8 will transform
>>a 10646 character to a sequence of between 1 and *6* octets.
>Btw when we made the French version of the UCS (JUC -- jeu universel de
>caractères codés sur plusieurs octets) we had a hard time understanding the
>real meaning of "transformation format"... after a long discussion we called
>it "format transformé" in French. Mr. Paterson (the current editor of the
>UCS) said he should and would perhaps change the English name to
>"transformed format"...
>However the string for ASN.1 stays as is, even if it is odd, as it is in
>this case part of the technical content now (we should not use English for
>identifiers, if there is a mistake in the naming, it misleads people!)
>Alain LaBonté
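
The 1-to-6 octet scheme quoted above can be sketched as follows. This is
an illustration under stated assumptions, not a reference implementation:
the function name utf8_encode_31bit is made up for this example, and the
encoding is done by hand because Python's built-in codec follows the later
restriction of UTF-8 to 4 octets.

```python
def utf8_encode_31bit(code: int) -> bytes:
    """Encode a UCS value below 2**31 with the original 1-to-6 octet UTF-8.

    Hypothetical helper for illustration; it does no validation beyond
    the range check.
    """
    if not 0 <= code < 2**31:
        raise ValueError("code must be in the range 0 .. 2**31 - 1")
    if code < 0x80:
        return bytes([code])  # 7-bit values are a single octet
    # (exclusive upper limit, number of octets, lead-octet marker bits)
    for limit, length, lead in (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    ):
        if code < limit:
            break
    octets = []
    # Emit continuation octets (10xxxxxx) from the low-order bits upward.
    for _ in range(length - 1):
        octets.append(0x80 | (code & 0x3F))
        code >>= 6
    # The remaining high-order bits go into the lead octet.
    octets.append(lead | code)
    return bytes(reversed(octets))


largest = utf8_encode_31bit(2**31 - 1)  # six octets long
```

For values that today's UTF-8 also covers, the result should agree with
the built-in codec, e.g. utf8_encode_31bit(0xE9) == "é".encode("utf-8").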
