Re: UTF-9

From: John Cowan (cowan@mercury.ccil.org)
Date: Fri Oct 31 2003 - 19:26:34 CST

Next message: Helena Shih: "Common Locale Data Repository V1.0 Released!"
Previous message: Peter Constable: "RE: [hebrew] Re: Hebrew composition model, with cantillation marks"
Maybe in reply to: John Cowan: "UTF-9"
Next in thread: Philippe Verdy: "Re: UTF-9"
Reply: Philippe Verdy: "Re: UTF-9"

Mark Crispin scripsit:

> [Read: "crazy old farts who still care about
> obsolete processors and have the temerity to think about implementing
> Unicode on them in native form."]

And their epigonoi.

> I thought about UTF-18, but I couldn't think of a good way to represent
> Unicode in 18 bits without surrogates. On the other hand, the idea to cover
> 0/1/2/14 (BMP/SMP/SIP/SSP) in a UTF-18 is interesting.

I agree, and think it makes sense.

> It would still need surrogates though. Are the D800-DFFF codepoints
> reserved in all planes or just in the BMP? I wonder if there is some
> way we could do all of ISO 10646 in UTF-18.

Only on the BMP. Planes 2. through 13., and 15. and 16., can be expressed
by surrogates. Planes above 16. have been definitively abandoned by both
ISO 10646 and Unicode, and need not be encoded.
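
For concreteness, here is a minimal sketch of the standard UTF-16 surrogate arithmetic that lets any code point in planes 1. through 16. be expressed as a pair of values from the D800-DFFF range. The function name to_surrogates is my own invention for illustration; nothing here is specific to the proposed UTF-18.

```python
def to_surrogates(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into a
    (high, low) surrogate pair, as UTF-16 does."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only planes 1 through 16 use surrogates")
    offset = cp - 0x10000            # 20-bit value
    high = 0xD800 + (offset >> 10)   # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return high, low


def plane(cp):
    """Plane number of a code point (0 is the BMP)."""
    return cp >> 16


if __name__ == "__main__":
    hi, lo = to_surrogates(0x1D11E)  # MUSICAL SYMBOL G CLEF, plane 1
    print(plane(0x1D11E), hex(hi), hex(lo))
```

Since the largest offset is 0xFFFFF, the pair always stays within D800-DBFF and DC00-DFFF, which is why only those BMP code points need to be reserved.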

To these proposals I would add UTF-8H and UTF-24 for the LINC, PDP-5,
PDP-8, LINC-8, and PDP-12 architectures. UTF-8H is identical to UTF-8,
except that the most significant bit of each octet is inverted. This
is intended to adapt to the convention on these architectures, which encode
ASCII in octets with the high bit set. UTF-24 represents each Unicode
scalar value in two consecutive 12-bit words, high order word first.
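
A rough sketch of both encodings as described above, assuming nothing beyond that description. The function names are hypothetical, and 12-bit words are modelled as plain Python integers in a list rather than as any real PDP-8 storage format.

```python
def utf8h_encode(text):
    """UTF-8 with the most significant bit of every octet inverted."""
    return bytes(b ^ 0x80 for b in text.encode("utf-8"))


def utf8h_decode(data):
    """Undo the bit inversion, then decode as ordinary UTF-8."""
    return bytes(b ^ 0x80 for b in data).decode("utf-8")


def utf24_encode(text):
    """Each scalar value becomes two 12-bit words, high-order word first."""
    words = []
    for ch in text:
        cp = ord(ch)
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError("surrogates are not scalar values")
        words.append(cp >> 12)    # high-order 12 bits (at most 0x10F)
        words.append(cp & 0xFFF)  # low-order 12 bits
    return words


def utf24_decode(words):
    """Reassemble scalar values from consecutive (high, low) word pairs."""
    if len(words) % 2:
        raise ValueError("UTF-24 needs an even number of words")
    return "".join(chr((hi << 12) | lo)
                   for hi, lo in zip(words[0::2], words[1::2]))


if __name__ == "__main__":
    print(utf8h_encode("A").hex())                   # ASCII with high bit set
    print([oct(w) for w in utf24_encode("\U0001D11E")])
```

Under this sketch an ASCII "A" (0x41) becomes the single octet 0xC1 in UTF-8H, matching the high-bit-set convention, while lead and continuation octets of multi-octet sequences end up with the high bit clear.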

-- 
"But the next day there came no dawn,           John Cowan
and the Grey Company passed on into the         jcowan@reutershealth.com
darkness of the Storm of Mordor and were        http://www.ccil.org/~cowan
lost to mortal sight; but the Dead              http://reutershealth.com
followed them.          --"The Passing of the Grey Company"

