Re: Questionable definition of Unicode

From: Jukka K. Korpela
Date: Thu Jan 24 2008 - 11:52:46 CST

    Marion Gunn wrote:

    > _ISO 10646_ is the character set, _Unicode_ its intended 'single
    > encoding scheme'.

    I would strongly recommend against using the phrase "character set" at
    all, except among highly educated (in character matters) adults.

    "'Character set' considered harmful", wrote Dan Connolly years ago, and
    the confusion has become even more serious since then. When you use the
    phrase, you can be construed as referring to a set (repertoire) of
    characters, a code for encoding characters as integers, or a method of
    encoding characters as sequences of octets, or any combination thereof.
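
    To make the three readings concrete, here is a minimal Python sketch.
    The sample string and the variable names are illustrative assumptions,
    not part of the original message; UTF-8 and UTF-16 stand in for "a
    method of encoding" only as familiar examples.

```python
# Illustrative sample only: three characters chosen to need 1, 2 and 3 octets in UTF-8.
text = "Aé€"

# Reading 1: a repertoire, i.e. an unordered set of abstract characters.
repertoire = set(text)

# Reading 2: a code that maps each character to an integer (its code point).
code_points = [ord(ch) for ch in text]

# Reading 3: a method of encoding those integers as sequences of octets.
# The same code points yield different octets under different encodings.
utf8_octets = text.encode("utf-8")
utf16_octets = text.encode("utf-16-be")

print(sorted(repertoire))
print([f"U+{cp:04X}" for cp in code_points])
print(utf8_octets.hex())
print(utf16_octets.hex())
```

    The repertoire and the code points are the same in both cases; only the
    octet sequences differ, which is why a bare "character set" leaves it
    unclear which of the three levels is meant.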

    Logically, and pragmatically, Unicode does not need ISO 10646 as its
    basis, and vice versa. It is worth noting that they have been
    coordinated with each other, but either of them _could_ stand on its own
    quite well.

    Jukka K. Korpela ("Yucca")
