Re: what encoding is used in SMS (GSM Mobile) ?

From: Ken Krugler (
Date: Sun Nov 07 2004 - 15:29:09 CST

    >I would like to know how SMS messages are encoded.
    >What encoding is used? UCS-2? UTF-8?
    >Possible scenarios: (1) simple ASCII text; (2) characters from ASCII
    >combined with some from ISO-8859-x; (3) all Asian text; (...)

    The three encodings that I know have been used for SMS text, at least
    with Palm OS, are:

    a. Raw text - 8-bit data with no defined encoding, which usually gets
    treated as ISO 8859-1, but the interpretation is completely up to the
    recipient. Or at least that was my understanding.

    b. GSM - a 7-bit encoding with two unusual features: it uses 0 (NUL)
    as a valid code point (it maps to '@'), and it uses an escape code
    (0x1B) as a way of accessing a secondary table of code points.

    c. UCS-2.
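
    As a rough illustration of the two unusual features of (b), here is
    a minimal Python sketch. It assumes the default alphabet and
    extension table as I understand them from GSM 03.38; the names
    GSM_BASIC, GSM_EXT and gsm_septets are made up for this example, and
    packing the 7-bit values into octets is left out.

```python
# Sketch only: the tables follow the GSM 03.38 default alphabet as
# commonly documented; all names here are hypothetical.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# Secondary table: each character is sent as ESC followed by this value.
GSM_EXT = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}

ESC = 0x1B  # position of the escape code in the basic table


def gsm_septets(text):
    """Return the list of 7-bit code values for text.

    Raises ValueError if a character is in neither table.
    """
    out = []
    for ch in text:
        if ch in GSM_EXT:
            # Two septets: the escape, then the secondary-table value.
            out.extend([ESC, GSM_EXT[ch]])
            continue
        idx = GSM_BASIC.find(ch)
        if idx < 0 or idx == ESC:
            raise ValueError("not representable in GSM 7-bit: %r" % ch)
        out.append(idx)
    return out
```

    With these tables, gsm_septets("@") gives [0], which is why 0 cannot
    be used as a string terminator, and gsm_septets("€") gives
    [27, 101], that is, the escape followed by the secondary-table value.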

    Depending on your locale, different approaches are taken to encoding
    messages. In China I think all messages are sent as UCS-2. In Hong
    Kong, first GSM is tried, but if that can't represent all of the
    characters then UCS-2 is the fall-back.
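
    The "try GSM first, fall back to UCS-2" approach could be sketched
    in Python roughly as follows. This is an illustration, not how any
    particular handset does it: the function names are hypothetical, the
    GSM character set is the default alphabet plus extension table as I
    understand them from GSM 03.38, and big-endian byte order for UCS-2
    is an assumption.

```python
# Sketch only: all names are hypothetical.
_GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
_GSM_EXTENSION = "\f^{}\\[~]|€"  # reached through the escape code
GSM_CHARS = set(_GSM_BASIC) | set(_GSM_EXTENSION)


def choose_sms_encoding(text):
    """Return "gsm7" if every character fits GSM, otherwise "ucs2"."""
    if all(ch in GSM_CHARS for ch in text):
        return "gsm7"
    return "ucs2"


def encode_ucs2(text):
    """Encode text as UCS-2, assuming big-endian byte order.

    UCS-2 has no surrogate pairs, so characters outside the Basic
    Multilingual Plane are rejected rather than silently encoded.
    """
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the BMP; not valid UCS-2")
    # For BMP-only text, UTF-16BE and UCS-2 produce the same bytes.
    return text.encode("utf-16-be")
```

    For example, choose_sms_encoding("Hello") returns "gsm7", while
    choose_sms_encoding("你好") returns "ucs2", since neither character
    is in the GSM tables.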

    -- Ken

    Ken Krugler
    TransPac Software, Inc.
    +1 530-470-9200

    This archive was generated by hypermail 2.1.5 : Sun Nov 07 2004 - 16:48:10 CST