Re: Unicode, SMS, PDA/cellphones

From: Doug Ewell (
Date: Sun May 28 2006 - 18:26:28 CDT


    Cristian Secară <orice at secarica dot ro> wrote:

    >> This sounds like an application for SCSU! The Romanian performance
    >> will take a slight hit from the distinction of comma below and
    >> cedilla in the Unicode glyph standard, [...]
    > What has this to do with the discussion here?
    > I am discussing the GSM character set here. This happens to have a few
    > Western Latin characters in it.

    Richard was suggesting that SCSU would have been a more appropriate
    encoding for SMS than the GSM character set. It allows access to the
    full Unicode repertoire and encodes most Latin-based orthographies,
    including Romanian, much more efficiently than GSM.
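
    To make the efficiency point concrete, here is a minimal sketch of
    what an SCSU encoder produces for text limited to ASCII plus the
    Latin-1 Supplement block. It assumes the encoder stays in its initial
    state (single-byte mode, dynamic window 0 at U+0080), in which each
    such character costs exactly one byte. The function name
    scsu_latin1_subset is hypothetical, and this is not a full SCSU
    encoder: anything outside U+0000..U+00FF is rejected rather than
    handled with window switches.

```python
def scsu_latin1_subset(text: str) -> bytes:
    """Encode text as SCSU, restricted to the initial single-byte state.

    Assumption: only pass-through ASCII and U+0080..U+00FF are supported.
    Characters such as U+0103 or U+0219 would need a window change,
    which this sketch deliberately does not implement.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp in (0x00, 0x09, 0x0A, 0x0D) or 0x20 <= cp <= 0x7F:
            # These code points pass through unchanged in single-byte mode.
            out.append(cp)
        elif 0x80 <= cp <= 0xFF:
            # Initial dynamic window 0 starts at U+0080, so the byte is
            # 0x80 + (cp - 0x80), which equals the code point itself.
            out.append(cp)
        else:
            raise ValueError(f"U+{cp:04X} needs a window change (not sketched)")
    return bytes(out)


if __name__ == "__main__":
    sample = "în România"
    print(len(scsu_latin1_subset(sample)), "bytes as SCSU (subset)")
    print(len(sample.encode("utf-16-be")), "bytes as 16-bit code units")
```

    For this subset the output is byte-for-byte the same as ISO 8859-1,
    so the sample costs one byte per character instead of two.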

    > For example, a message written with some accented characters for
    > the French language (like à, è or similar) will always fall within the
    > GSM character set, so the message will consist only of 7 bits per
    > character / 160 characters per message.
    > When I enter something particular to Romanian (î for example,
    > that is U+00EE), the whole message will turn to 16 bits per character /
    > 70 characters per message, even if the remaining 99% of my message has
    > only pure ASCII characters. The use of UTF-8 would have made great
    > sense here, but for some reason this option has been left out.

    That was exactly Richard's point: this would not happen if SCSU were
    used. SCSU does have a fallback to 16-bit "Unicode mode," but primarily
    for Han, Yi, and Hangul, which generally need 16 bits anyway.
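
    The all-or-nothing switch Cristian describes can be sketched as
    follows. This is an illustration only, not code from any handset or
    SMS gateway: the names GSM7_BASIC, GSM7_EXT and sms_single_message_fit
    are hypothetical, the table is the GSM default alphabet as I
    understand it (with the escape code itself left out), and
    multi-part messages are ignored.

```python
# GSM 7-bit default alphabet (basic table, escape code omitted).
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters: each costs two septets (escape + code).
GSM7_EXT = frozenset("^{}\\[~]|€")


def sms_single_message_fit(text: str):
    """Return (encoding, units_used, units_allowed) for one SMS.

    If every character is in the GSM tables, count septets against 160.
    Otherwise the whole message falls back to 16-bit units against 70.
    """
    if all(ch in GSM7_BASIC or ch in GSM7_EXT for ch in text):
        septets = sum(2 if ch in GSM7_EXT else 1 for ch in text)
        return ("GSM-7", septets, 160)
    # One character outside the tables forces 16-bit units for everything.
    units = len(text.encode("utf-16-be")) // 2
    return ("UCS-2", units, 70)


if __name__ == "__main__":
    print(sms_single_message_fit("Voilà, très bien"))  # stays in GSM-7
    print(sms_single_message_fit("a" * 100 + "î"))     # one î, all 16-bit
```

    In the second call a single U+00EE pushes 101 characters past the
    70-unit limit, although the same text without it would fit easily in
    160 septets.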

    Doug Ewell
    Fullerton, California, USA
