Re: Running out of code points

From: Doug Ewell (
Date: Sun Apr 23 2006 - 08:08:42 CST

    Richard Wordingham <richard dot wordingham at ntlworld dot com> wrote:

    > The worry is that the code points needed to do it well will already
    > have unnecessarily been allocated. Depending on how lone surrogates
    > are to be treated in the transitional period, one needs between 1 BMP
    > point (for 2**31 values) and one plane (for 2**30 values) to do it in
    > 4 UTF-16 code elements (i.e. 8 bytes). Is it reasonable to expect UTF-16
    > to drop out of use?

    If we need to encode only those writing systems used by humans on Earth,
    there will be plenty of space left over that could later be leveraged
    into a super-surrogate mechanism for encoding these putative writing
    systems from the planet Zoog. We don't have to do it now.
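
    For reference, the existing surrogate mechanism that any such
    super-surrogate scheme would build on can be sketched roughly as
    below. This is only an illustration in Python: the function names are
    made up for this example, and it shows today's UTF-16 pairing, not
    any proposed extension.

```python
# Surrogate ranges as defined for UTF-16.
HIGH_START, HIGH_END = 0xD800, 0xDBFF   # 1,024 high (lead) surrogates
LOW_START, LOW_END = 0xDC00, 0xDFFF     # 1,024 low (trail) surrogates


def encode_surrogate_pair(cp):
    """Split a supplementary code point into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000                # 20-bit value
    high = HIGH_START + (offset >> 10)   # top 10 bits
    low = LOW_START + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def decode_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a code point."""
    if not (HIGH_START <= high <= HIGH_END and LOW_START <= low <= LOW_END):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def code_space_size():
    """BMP code points plus everything reachable through surrogate pairs."""
    bmp = 0x10000
    supplementary = (HIGH_END - HIGH_START + 1) * (LOW_END - LOW_START + 1)
    return bmp + supplementary
```

    The pairing yields 1,024 x 1,024 = 2**20 supplementary code points,
    so the whole code space is 0x110000 (1,114,112) code points, counting
    the 2,048 surrogate code points themselves.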

    >> I hope you were not being serious.
    > It is the *Universal* character set. Or is this US hype, where
    > 'world' just means 'nationwide' and 'universal' just means 'global'? :-)
    > As to the extra-terrestrial scripts, well, a lot depends on who does
    > the travelling.

    Suppose the dreams of centuries came to pass, and we (a) discovered life
    on other planets, (b) that had a form of markings on objects (c) that we
    could classify as writing and (d) which we decided needed to be encoded.
    We still can't even get everyone to agree to the last two points about
    Phaistos, which is at least obviously Terran.

    I share your concern about common U.S. use of "world" (as in "World
    Series," the eligible teams for which are all located in the U.S. or
    within 90 miles driving distance), but don't see how Unicode has
    perpetuated this usage.

    Doug Ewell
    Fullerton, California, USA
