Questionable definition of Unicode

From: Stephane Bortzmeyer (bortzmeyer@nic.fr)
Date: Thu Jan 24 2008 - 04:20:41 CST


    In http://www.icann.org/topics/idn/idn-glossary.htm, one can find:

    > Unicode

    > Unicode is a commonly used single encoding scheme that provides a
    > unique number for each character across a wide variety of languages
    > and scripts. The Unicode standard contains tables that list the
    > "code points" (unique numbers) for each local character
    > identified. These tables continue to expand as more and more
    > characters are digitalized.

    Is it really a good idea to define Unicode as an *encoding scheme*?
    (Especially since there are several official encoding schemes of
    Unicode and many unofficial ones.)

    Using http://www.unicode.org/standard/WhatIsUnicode.html may be a
    better start. I suggest "Unicode is a commonly used character set
    that..."

    To quote the glossary in the Standard:

    > Character Encoding Scheme. A character encoding form plus byte
    > serialization. There are seven character encoding schemes in
    > Unicode: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and
    > UTF-32LE.
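
To make the distinction concrete, here is a minimal Python sketch (not taken from the Standard) showing one code point serialized under each of the seven schemes. The names `SCHEMES` and `serialize` are illustrative only, and the mapping of Python codec names onto the Standard's scheme names is an assumption. Note that when encoding, Python's "utf-16" and "utf-32" codecs prepend a byte order mark and use the platform's native byte order, so their output can differ between machines.

```python
# One character has one code point, but many possible byte serializations.
# Python codec names assumed to correspond to the seven encoding schemes.
SCHEMES = [
    "utf-8",
    "utf-16", "utf-16-be", "utf-16-le",
    "utf-32", "utf-32-be", "utf-32-le",
]


def serialize(char):
    """Return a dict mapping each scheme name to the bytes it produces."""
    return {name: char.encode(name) for name in SCHEMES}


if __name__ == "__main__":
    char = "\u00e9"  # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    # The code point is the same regardless of the scheme chosen below.
    print(f"code point: U+{ord(char):04X}")
    for name, data in serialize(char).items():
        hex_bytes = " ".join(f"{b:02x}" for b in data)
        print(f"{name:<10} {hex_bytes}")
```

The code point (U+00E9) stays fixed while the bytes vary by scheme, which is why calling Unicode itself "an encoding scheme" blurs two separate layers.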


