Re: Unicode forms for internal storage - BOCU-1 speed

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Jan 22 2004 - 16:48:50 EST

Next message: Mike Ayers: "RE: Unicode forms for internal storage - BOCU-1 speed"

Previous message: jcowan@reutershealth.com: "Re: Unicode forms for internal storage - BOCU-1 speed"
In reply to: jcowan@reutershealth.com: "Re: Unicode forms for internal storage - BOCU-1 speed"
Next in thread: Jon Hanna: "Re: Unicode forms for internal storage - BOCU-1 speed"
Reply: Jon Hanna: "Re: Unicode forms for internal storage - BOCU-1 speed"

From: <jcowan@reutershealth.com>
To: "Philippe Verdy" <verdy_p@wanadoo.fr>
Cc: "Markus Scherer" <markus.scherer@jtcsv.com>; <unicode@unicode.org>
Sent: Thursday, January 22, 2004 10:26 PM
Subject: Re: Unicode forms for internal storage - BOCU-1 speed

> Philippe Verdy scripsit:
>
> > Is the other competing UTF-9 from Jerome Abela this one:
>
> No. Abela's version preserves all of 00-7F and A0-FF, packing all the rest
> of Unicode into sequences beginning with any of 80-9F.

Thanks for pointing this out.
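
As a rough illustration of the byte partition described in the quoted reply, here is a minimal Python sketch. It only classifies a byte that sits at the start of a character. The function name and the returned labels are made up for this example, and nothing is assumed about how the bytes following an 80-9F lead byte are laid out, since that is not stated in this thread.

```python
def classify_start_byte(b: int) -> str:
    """Classify a byte found at the start of a character, following the
    partition quoted above. Labels are illustrative only, not taken from
    any specification."""
    if not 0x00 <= b <= 0xFF:
        raise ValueError("not a byte value: %r" % (b,))
    if b <= 0x7F:
        return "ascii"            # 00-7F: preserved as-is
    if b <= 0x9F:
        return "sequence-start"   # 80-9F: begins a multi-byte sequence
    return "latin1-upper"         # A0-FF: preserved as-is


for sample in (0x41, 0x85, 0xE9):
    print(hex(sample), classify_start_byte(sample))
```

The point is simply that the three ranges are disjoint and cover all 256 byte values, so the first byte of a character is enough to tell whether a longer sequence follows.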

By the way, I don't think there is any official reference that attributes
the acronym "UTF-9" to any of these encoding forms. I think that if "UTF-9"
is used at all, it should be agreed upon by Unicode as a single official
representation. The other forms need a different encoding label, one not
starting with "UTF", since that prefix should be reserved for encoding forms
approved by Unicode and ISO/IEC 10646.

We have already suffered in the past from the confusion caused by the various
interpretations of "UTF-8" (until CESU-8 was documented, and the acronym
"UTF-8" removed from the JNI documentation for Java) and by confusion
between UTF-16, UTF-16BE, UTF-16LE and UCS-2. I therefore think that "UTF-9"
is a bad acronym for a specific unapproved (non-standard) encoding form, and
its use on this mailing list just adds more confusion, because there is
no such "UTF-9" standard until one is documented in an IETF RFC, in
ISO/IEC 10646, or by Unicode.
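
To make the "UTF-8" ambiguity concrete, here is a minimal Python sketch of how the two byte forms differ for a supplementary character. The helper name cesu8_encode is hypothetical (Python has no built-in CESU-8 codec); it is written from the general description of CESU-8, where each supplementary code point is first split into a UTF-16 surrogate pair and each surrogate is then encoded as a 3-byte sequence. Treat it as an illustration, not a reference implementation.

```python
def cesu8_encode(text: str) -> bytes:
    """Hypothetical helper: encode text in a CESU-8 style.

    BMP characters are encoded exactly as in UTF-8. Supplementary
    characters are split into a surrogate pair, and each surrogate is
    encoded as its own 3-byte sequence.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            v = cp - 0x10000
            high = 0xD800 + (v >> 10)    # lead surrogate
            low = 0xDC00 + (v & 0x3FF)   # trail surrogate
            for surrogate in (high, low):
                # "surrogatepass" lets Python emit the 3-byte form
                out += chr(surrogate).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


ch = "\U00010400"  # a supplementary-plane character
print(ch.encode("utf-8").hex())   # standard UTF-8: one 4-byte sequence
print(cesu8_encode(ch).hex())     # CESU-8 style: two 3-byte sequences
```

For text limited to the BMP the two outputs are byte-for-byte identical, which is why both forms could circulate under the same "UTF-8" label without the difference being noticed.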


This archive was generated by hypermail 2.1.5 : Thu Jan 22 2004 - 17:34:17 EST