Unicode charsets registered with IANA

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Mon Sep 30 2002 - 12:03:30 EDT

    BOCU-1 is now an IANA-registered charset: http://www.iana.org/assignments/character-sets

    I thought it might be useful and interesting to show the list of Unicode charsets that are registered:

    Format: charset name (MIBenum), followed by aliases, if any (see * below)

    UTF-7 (MIBenum 1012)
    UTF-8 (MIBenum 106)
    UTF-16 (MIBenum 1015)
    UTF-16BE (MIBenum 1013)
    UTF-16LE (MIBenum 1014)
    UTF-32 (MIBenum 1017)
    UTF-32BE (MIBenum 1018)
    UTF-32LE (MIBenum 1019)
    CESU-8 (MIBenum 1016)
    SCSU (MIBenum 1011)
    BOCU-1 (MIBenum 1020)
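
For illustration, the list above can be written down as a small lookup table. This is only a sketch in Python: the names `UNICODE_CHARSET_MIBENUMS` and `mibenum_for` are made up for this example and are not part of any registry or library. The lookup ignores case, on the assumption that charset names are matched case-insensitively.

```python
# MIBenum values copied from the list above.
UNICODE_CHARSET_MIBENUMS = {
    "UTF-7": 1012,
    "UTF-8": 106,
    "UTF-16": 1015,
    "UTF-16BE": 1013,
    "UTF-16LE": 1014,
    "UTF-32": 1017,
    "UTF-32BE": 1018,
    "UTF-32LE": 1019,
    "CESU-8": 1016,
    "SCSU": 1011,
    "BOCU-1": 1020,
}


def mibenum_for(name):
    """Return the MIBenum for a charset name, or None if it is not in the table.

    Hypothetical helper: the comparison is case-insensitive and ignores
    surrounding whitespace.
    """
    wanted = name.strip().upper()
    for charset, mibenum in UNICODE_CHARSET_MIBENUMS.items():
        if charset.upper() == wanted:
            return mibenum
    return None
```

A call such as `mibenum_for("utf-8")` finds the UTF-8 entry regardless of case, while a name that is not in the table yields `None`.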

    There are also some (obsolete) Unicode 1.1 charsets registered:

    UNICODE-1-1-UTF-7 (MIBenum 103) csUnicode11UTF7
    UNICODE-1-1 (MIBenum 1010) csUnicode11 [this uses the UCS-2BE encoding scheme]

    This is even more obsolete:

    ISO-10646-UTF-1 (MIBenum 27) csISO10646UTF1 [Unicode 1.0? Very obsolete encoding scheme.]

    The registry also contains some registrations that are:
    - encoding forms, not charsets
    - repertoires, not charsets
    - subsets of Unicode, with the UCS-2BE encoding scheme

    * There is always one implicit alias: "cs" + the primary charset name.
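
Taking the footnote literally, the implicit alias could be derived as in the sketch below. The function names `implicit_alias` and `names_for` are hypothetical and exist only for this example. Note that the explicitly registered aliases shown above (for example csUnicode11UTF7) drop the hyphens, so they do not coincide with a literal "cs" + name concatenation; the sketch simply lists both.

```python
def implicit_alias(primary_name):
    """Literal reading of the footnote: "cs" + the primary charset name."""
    return "cs" + primary_name


def names_for(primary_name, explicit_aliases=()):
    """Collect the primary name, any explicit aliases, and the implicit alias.

    Hypothetical helper; the explicit aliases must be supplied by the caller.
    """
    return [primary_name, *explicit_aliases, implicit_alias(primary_name)]
```

For instance, `names_for("UNICODE-1-1", ["csUnicode11"])` returns the primary name, the registered alias, and the literal concatenation, in that order.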

    Best regards,

    Opinions expressed here may not reflect my company's positions unless otherwise noted.