Re: Unicode, SMS, PDA/cellphones

From: Theodore H. Smith (
Date: Sun May 28 2006 - 09:44:48 CDT

    > On Sun, 28 May 2006 16:31:06 +0800, Donald Z. Osborn wrote:
    >> * Message length being rather shorter in Unicode SMS than with 7 or
    >> 8 bit
    > Ordinary [Latin] SMS messages use the 7-bit GSM character set. Only
    > a few additional characters need an escape character.
    > (ref.: )
    > A single SMS message written solely with characters from the 7-bit GSM
    > character set can hold at most 160 characters. If a single non-GSM
    > character is entered during composition, the whole message switches to
    > a double-byte encoding, limiting a single message to at most 70
    > characters. I don't know whether each transmitted character is a plain
    > 2-byte BMP code point or is encoded as UTF-16.
    > Every time I try to send an SMS message that includes accented
    > characters for my language (Romanian), I can't help blaming those who
    > established the SMS technical standard, because a fixed 2 bytes per
    > character for Latin text is a pure waste of space (and money :).
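
The 160 versus 70 figures quoted above follow from simple arithmetic, if one assumes the usual 140-octet user-data payload of a single SMS (that figure is background knowledge, not stated in this thread). A rough Python sketch; the function name and the character tables are illustrative and typed from memory of the GSM default alphabet, so treat them as an approximation rather than a reference copy:

```python
# Assumption: a single SMS carries 140 octets of user data.
SMS_PAYLOAD_OCTETS = 140

# Approximate GSM 7-bit default alphabet (escape code itself omitted).
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters reached through the escape character (two septets each).
GSM_EXTENDED = "^{}\\[~]|€\f"


def single_sms_capacity(text):
    """Return (encoding name, max code units in one message) for text."""
    if all(ch in GSM_BASIC or ch in GSM_EXTENDED for ch in text):
        # 140 octets * 8 bits / 7 bits per septet = 160 septets.
        return "GSM 7-bit", SMS_PAYLOAD_OCTETS * 8 // 7
    # One character outside the GSM set forces 16-bit units for the
    # whole message: 140 octets / 2 = 70 units.
    return "16-bit", SMS_PAYLOAD_OCTETS // 2


print(single_sms_capacity("Buna ziua"))  # unaccented: fits the GSM set
print(single_sms_capacity("Bună ziua"))  # Romanian "ă" is not in the GSM set
```

With these assumptions the first call reports a limit of 160 and the second a limit of 70, which is the drop described in the quoted message. Escaped characters are only checked for membership here; a fuller version would count each of them as two septets.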

    BOCU would have been more sensible. It can usually encode codepoints
    above 256 in one byte per character, and it can represent every code

