Re: Unicode Stability (Was: Re: E0000 Language Tags for Some Obscure Languages)

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Mar 04 2005 - 13:38:24 CST

    Jeroen Ruigrok van der Werven asked:

    > Given these points, wouldn't an ever-expanding standard like Unicode be a
    > cause to data bloat at one point? Since you will need continuous larger
    > encoding space to encode certain specific characters?

    The actual "data bloat" rate of the standard right now is slightly
    more than 1000 characters encoded per year. On a base of more than
    96,000 characters already encoded, that is a growth rate of just
    over 1% per annum.

    I have done the calculations a number of times on this list to
    demonstrate that that rate leaves the current standard good for
    700+ years of additions without any architectural change.
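
    For readers who want to check the order of magnitude, the arithmetic
    can be sketched roughly as below. This is an illustration only, not
    the calculation from the earlier list postings. It assumes the
    17-plane codespace (U+0000..U+10FFFF), sets aside surrogates,
    noncharacters, and the private use areas, and treats the rate of
    additions as constant. The names (years_of_headroom and the
    constants) are made up for this sketch.

```python
# Rough headroom estimate for the Unicode codespace.
# All figures besides the 96,000 / 1000-per-year numbers quoted in the
# message are assumptions of this sketch, not taken from the message.

TOTAL_CODE_POINTS = 0x110000          # 17 planes x 65,536 code points
SURROGATES = 0x800                    # U+D800..U+DFFF
NONCHARACTERS = 66                    # U+FDD0..U+FDEF plus 2 per plane
PRIVATE_USE = 6400 + 2 * 65534        # U+E000..U+F8FF plus planes 15 and 16

# Code points that could ever be assigned to standard characters.
ASSIGNABLE = TOTAL_CODE_POINTS - SURROGATES - NONCHARACTERS - PRIVATE_USE

ENCODED_2005 = 96_000                 # "more than 96,000" per the message


def growth_rate(chars_per_year, encoded=ENCODED_2005):
    """Yearly additions as a fraction of what is already encoded."""
    return chars_per_year / encoded


def years_of_headroom(chars_per_year, encoded=ENCODED_2005):
    """Years until the assignable space is used up at a constant rate."""
    return (ASSIGNABLE - encoded) / chars_per_year


if __name__ == "__main__":
    for rate in (1000, 1200):
        print(f"{rate} chars/year: "
              f"{growth_rate(rate):.2%} growth, "
              f"{years_of_headroom(rate):.0f} years of headroom")
```

    Under these assumptions, a rate anywhere from 1000 to about 1200
    characters per year leaves more than 700 years of room, and a
    declining rate would stretch that further.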

    And except for a couple of foreseeable "hiccups" for CJK characters
    being sorted out now, the rate of additions will *decline* rather
    than rise in the future, because the pool of remaining good candidates
    for encoding is dropping, and the types of outstanding candidates
    (historic scripts, oddball symbol collections that edge off into
    icons, logos, and pictures) are increasingly difficult to generate
    good proposals for and to reach clear consensus on encoding.

    >
    > Not to mention that the supporting fonts will get bigger and bigger.

    Actually not, for the most part. Most font support is segmented into
    useful subsets (by script and other criteria). Some new characters
    gradually get added to supporting fonts, but many anticipated
    additions, such as Sumero-Akkadian cuneiform (due soon) or
    Egyptian hieroglyphics (not even in ballot yet) will mostly be
    supported by specialist fonts dedicated just to them.

    > Although I have no idea how much of a problem it is given storage prices
    > nowadays.
    >
    > Anyone clued about this?

    Yep. And it isn't a serious issue, compared to the many other
    issues we deal with for the standard.

    --Ken


