Re: Code pages and Unicode

From: Ken Whistler <kenw_at_sybase.com>
Date: Mon, 22 Aug 2011 16:18:56 -0700

On 8/22/2011 3:15 PM, Richard Wordingham wrote:
> On Monday 22 August 2011, Andrew West <andrewcwest_at_gmail.com> wrote:
>
>> Can anyone think of a way to extend UTF-16 without adding new
>> surrogates or inventing a new general category?
>>
>> Andrew
>
> How about a triple sequence of two high surrogates followed by one
> low surrogate?

How about Clause 12.5 of ISO/IEC 10646:

<001B, 0025, 0040>

You "escape" out of UTF-16 to ISO 2022, and then you can do whatever the
heck you want, including exchange and processing of complete 4-byte forms,
with all the billions of characters folks seem to think they need.
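
For anyone who wants to see what that looks like on the wire, here is a minimal sketch, assuming the three values above are carried as ordinary 16-bit code units (ESC, "%", "@"). The function names are made up for illustration and are not taken from any standard or library; nothing here implements an actual switch to ISO 2022, it only serializes and detects the sequence.

```python
import struct

# The three code units quoted above: ESC, '%', '@'.
ESCAPE_UNITS = (0x001B, 0x0025, 0x0040)


def escape_as_utf16_bytes(big_endian=True):
    """Serialize the three code units as UTF-16 bytes (hypothetical helper)."""
    fmt = ">3H" if big_endian else "<3H"
    return struct.pack(fmt, *ESCAPE_UNITS)


def find_escape(units):
    """Return the index of the first escape sequence in a sequence of
    16-bit code units, or -1 if it does not occur (hypothetical helper)."""
    n = len(ESCAPE_UNITS)
    for i in range(len(units) - n + 1):
        if tuple(units[i:i + n]) == ESCAPE_UNITS:
            return i
    return -1


if __name__ == "__main__":
    print(escape_as_utf16_bytes().hex(" "))
    print(find_escape([0x0041, 0x001B, 0x0025, 0x0040, 0x0042]))
```

What a receiver does after spotting the sequence is exactly the part that implementers would have to agree on.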

Of course you would have to convince implementers to honor the ISO 2022
escape sequence and "liberate" themselves into a high-level world of
nosebleed character numerosity. But then I guess by the time this is needed,
folks are counting on the need being self-evident. ;-)

--Ken
Received on Mon Aug 22 2011 - 18:23:15 CDT
