Re: ASCII and Unicode lifespan

From: Alexander Kh. (
Date: Wed May 18 2005 - 21:55:16 CDT



    Mark wrote:

    > It iz extreemlee difikult too git peepul too caenj a wiedspred sistum, eevun
    > if it iz not az rashunul az it cuud bee, win dhu kosts uv dooeen soe or
    > veree hie. Dhu reezun dhat Yoonikoed waz suc u sukses wuz dhat dhu paen
    > kaazd bie dhee dhen-egzisteen olturnutiv, u *hyooj* muras uv kunflikteen
    > koedpaejiz, wuz soe veree muc hieur. Soe peepul kuud see u kleer benufit
    > dhat maed it wurth dhee ekspens.

    That I realize, especially when it is Microsoft that is paying most of the
    bill; I fully expect their systems to be based on what they paid for.
    However, many people still pay for traffic, and switching from a local
    encoding to Unicode will double the traffic right away. With a state-machine
    approach, though, encodings can be changed on the fly by means of a special
    escape code. That is one way of getting the benefits of both approaches, not to
    mention that local encodings are better thought out in design.
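To make the escape-code idea concrete, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration only: the escape byte 0x1B, the one-byte page ID that follows it, and the three-entry page table are hypothetical and do not describe any existing standard.

```python
# Hypothetical scheme, not an existing standard: the byte ESC followed by a
# one-byte page ID switches the active 8-bit code page.
ESC = 0x1B
PAGES = {0: "ascii", 1: "koi8_r", 2: "iso8859_7"}  # assumed page table


def encode(text: str) -> bytes:
    """Encode text, emitting ESC + page ID whenever the page must change."""
    out = bytearray()
    current = 0  # both sides start in page 0 (ASCII)
    for ch in text:
        if ord(ch) == ESC:
            raise ValueError("the escape character itself is reserved")
        try:
            encoded = ch.encode(PAGES[current])
        except UnicodeEncodeError:
            # Look for some other page that can represent this character.
            for page_id, codec in PAGES.items():
                try:
                    encoded = ch.encode(codec)
                except UnicodeEncodeError:
                    continue
                out += bytes([ESC, page_id])
                current = page_id
                break
            else:
                raise ValueError(f"no page can encode {ch!r}")
        out += encoded
    return bytes(out)


def decode(data: bytes) -> str:
    """Decode bytes produced by encode(), tracking the active page."""
    out = []
    current = 0
    i = 0
    while i < len(data):
        if data[i] == ESC:
            current = data[i + 1]  # the next byte names the new page
            i += 2
            continue
        out.append(data[i:i + 1].decode(PAGES[current]))
        i += 1
    return "".join(out)


message = "Hello мир"
packed = encode(message)
print(len(packed), len(message.encode("utf-8")))  # 11 12
print(decode(packed) == message)                  # True
```

For this short sample the switch costs two bytes, so the saving over UTF-8 is a single byte; it grows with longer runs in one script. Note also that in this sketch a reader has to process the stream from the start, since the meaning of a byte depends on the last escape seen.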

    Also, consider this idea: how about using a code for the "shift" key, which would
    cut the use of code space in half? Capital letters are relatively rare in a sentence
    (unless it's a spam message, of course), and the overhead of using that additional
    character in UTF-8 would be minimal compared to the advantage of being able to store
    more encodings in a one-byte character space. I guess I'm on the wrong list to discuss UTF-8...
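The "shift" idea can be sketched the same way. This is only an illustration under my own assumptions: the marker character U+000E is a hypothetical choice, and the functions work on Python strings rather than on any real byte-level encoding.

```python
# Hypothetical marker: one reserved character meaning "the next letter is
# a capital". Only lowercase letters then need their own code points.
SHIFT = "\x0e"


def shift_encode(text: str) -> str:
    """Replace each capital letter with SHIFT + its lowercase form."""
    out = []
    for ch in text:
        if ch == SHIFT:
            raise ValueError("the shift marker itself is reserved")
        lower = ch.lower()
        # Only rewrite letters whose case mapping round-trips one-to-one.
        if ch != lower and len(lower) == 1 and lower.upper() == ch:
            out.append(SHIFT + lower)
        else:
            out.append(ch)
    return "".join(out)


def shift_decode(text: str) -> str:
    """Undo shift_encode(): uppercase the character after each SHIFT."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == SHIFT:
            out.append(next(chars).upper())
        else:
            out.append(ch)
    return "".join(out)


sample = "Hello World"
packed = shift_encode(sample)
print(len(sample), len(packed))        # 11 13
print(shift_decode(packed) == sample)  # True
```

The cost is one extra code per capital letter, so ordinary sentences grow only slightly, while text written entirely in capitals would double in length.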

    > Dhu odz uv dhat hapining in dhu foerseeubul fyoocur foer dhee inkoedeen uv
    > tekst or, at best, miniskyool. Noebudee iz goeeen too caenj too u brand-noo
    > inkoedeen bikuz "BRACKET" iz mispeld oer biblikul heebroo iz sumwut moer
    > okwurd dhan it kuud bee; it iz for too kostlee.
    > Mark

    That depends only on whether the Mozilla developers consider that encoding useful.
    I represent the young generation, and I still have hope for a bright future. I don't
    believe there will be many pan-Unicode fonts anyway, and using double the amount
    of space for small letter sets is a big waste.

    Consider this example: suppose I have a bilingual database, English-Russian for
    example. I am not planning to use any of the Chinese characters, so why would I use
    16-bit characters?
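To put rough numbers on the English-Russian case, the following sketch counts the bytes a short mixed sample takes in three encodings. The sample string and the helper name encoded_sizes are my own choices for illustration; the codec names are the ones Python's standard library uses.

```python
def encoded_sizes(text, codecs=("koi8_r", "utf_8", "utf_16_le")):
    """Return the encoded length in bytes of text under each codec."""
    return {codec: len(text.encode(codec)) for codec in codecs}


sample = "cat кот"  # three Latin letters, a space, three Cyrillic letters
print(encoded_sizes(sample))
# {'koi8_r': 7, 'utf_8': 10, 'utf_16_le': 14}
```

For this sample the 16-bit form takes exactly twice the space of the local 8-bit encoding, while UTF-8 falls in between because it spends one byte on each Latin letter and two on each Cyrillic letter.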

    Also, every script has its own particular properties: letter ordering, case
    sensitivity, numeric systems, etc. It will be difficult to maintain all those
    peculiarities of every script in one rigid standard anyway. The result will be
    a big overhead, requiring huge amounts of programming and resources to map all those
    orderings and other peculiarities onto one standard interface. The local encodings
    are aware of those peculiarities, and each is designed for a particular purpose.
    It will be more reasonable to continue using local encodings for some applications.
    That is why I suggest using an 8-bit model such as a state-machine UTF-8 with a
    switchable ASCII area, selected by means of an escape code, together with a specially
    encoded "shift" code to avoid doubling the use of code space for capital letters.

    Best regards,

    Alexander Kh.


    This archive was generated by hypermail 2.1.5 : Wed May 18 2005 - 21:56:03 CDT