Re: Unicode, SMS, PDA/cellphones

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Sun May 28 2006 - 14:55:04 CDT

    Cristian Secara wrote on Sunday, May 28, 2006 at 3:08 PM
    Re: Unicode, SMS, PDA/cellphones

    > On Sun, 28 May 2006 16:31:06 +0800, Donald Z. Osborn wrote:
    >
    >> * Message length being rather shorter in Unicode SMS than with 7 or
    >> 8 bit
    >
    > Ordinary [Latin] SMS messages use the 7-bit GSM character set; only
    > a few additional characters require an escape character.
    > (ref.: http://www.csoft.co.uk/sms/character_sets/gsm.htm )
    > A single SMS message written solely with characters from the 7-bit GSM
    > character set can hold at most 160 characters. If a single non-GSM
    > character is entered during composition, the whole message switches
    > to a double-byte encoding, limiting a single message to at most
    > 70 characters. I don't know whether each transmitted character is sent
    > as a plain 2-byte BMP code unit or in the UTF-16 encoding form.
    >
    > Every time I try to send an SMS message that includes accented
    > characters for my language (Romanian), I can't help blaming those who
    > established the SMS technical standard, because a fixed 2 bytes per
    > character for Latin is a pure waste of space (and money :).
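
The 160 and 70 figures quoted above follow from the 140-octet user-data field of a single SMS: 140 * 8 / 7 = 160 septets, or 140 / 2 = 70 two-byte characters. A minimal Python sketch of that arithmetic is below. The names `max_chars` and `encoding_and_limit` are hypothetical, the alphabet tables are transcribed from the GSM 03.38 default alphabet and should be checked against the table referenced above, and concatenated (multi-part) messages are ignored.

```python
SMS_PAYLOAD_OCTETS = 140  # user-data size of one non-concatenated SMS

def max_chars(bits_per_char: int) -> int:
    """Characters that fit in one SMS at a fixed width per character."""
    return SMS_PAYLOAD_OCTETS * 8 // bits_per_char

# GSM 7-bit default alphabet (transcribed; verify against the referenced table).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters reached through the escape character (two septets each).
GSM7_EXTENSION = set("^{}\\[~]|€\f")

def encoding_and_limit(text: str):
    """Return the encoding a handset would need and the nominal per-message limit.

    One character outside the GSM alphabet forces the whole message to the
    2-byte encoding, as described in the quoted text.
    """
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENSION for ch in text):
        return ("GSM-7", max_chars(7))
    return ("UCS-2", max_chars(16))

if __name__ == "__main__":
    for sample in ("Buna ziua", "Bun\u0103 ziua"):
        print(repr(sample), encoding_and_limit(sample))
```

The second sample differs from the first only by one a breve, which is enough to cut the limit from 160 to 70. Note that 160 is a nominal figure: each escaped character consumes two septets, so a GSM-7 message using them holds fewer characters.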

    This sounds like an application for SCSU! The Romanian performance will
    take a slight hit from the distinction between comma below and cedilla in
    the Unicode glyph standard, as there will be a 2-byte overhead each to
    define the windows for Latin Extended-A (a breve and o breve) and the high
    half of Latin Extended-B (s and t with comma below). I expect these
    characters would use 2 bytes each, while Latin-1 (a and i with circumflex)
    would get the 1-byte codes. ASCII characters would always be encoded as
    1 byte in alphabetic text.
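
As a rough illustration of the accounting in the paragraph above, here is a back-of-the-envelope estimate in Python. It is not an SCSU encoder. It simply charges 1 byte for ASCII and Latin-1 characters, 2 bytes for a character in any other 128-character block, and a one-time 2 bytes the first time such a block's window has to be defined. The function names, the alignment of windows to 128-character blocks, and the choice to ignore window-switching tags and non-BMP characters are all simplifying assumptions.

```python
DEFINE_WINDOW_COST = 2        # assumed: one tag byte plus one offset byte
LATIN1_WINDOW = 0x0080        # assumed to be available without a definition

def window_of(ch: str):
    """Start of the 128-character block holding ch, or None for ASCII."""
    cp = ord(ch)
    if cp < 0x80:
        return None
    return cp & ~0x7F

def estimate_scsu_bytes(text: str) -> int:
    """Estimate the compressed size using the simplified per-character costs."""
    total = 0
    defined = {LATIN1_WINDOW}
    for ch in text:
        window = window_of(ch)
        if window is None or window == LATIN1_WINDOW:
            total += 1                        # ASCII or Latin-1: one byte
        else:
            if window not in defined:
                defined.add(window)
                total += DEFINE_WINDOW_COST   # one-time window definition
            total += 2                        # character outside Latin-1
    return total

def ucs2_bytes(text: str) -> int:
    """Size of the same text at a fixed two bytes per BMP character."""
    return len(text.encode("utf-16-be"))

if __name__ == "__main__":
    sample = "Bun\u0103 ziua, \u021bar\u0103"
    print(estimate_scsu_bytes(sample), "vs", ucs2_bytes(sample))
```

Under these assumptions the saving comes from the ASCII and Latin-1 letters, which dominate ordinary Romanian text. A very short string made up mostly of comma-below letters could come out no smaller than the fixed 2-byte encoding.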

    What's happened to the old telegraphic standard for Romanian? I understand
    that it used 'tz' for t with comma below.

    Richard.
