Re: Origin of the U+nnnn notation

From: Antoine Leca
Date: Tue Nov 08 2005 - 12:25:24 CST


    On Tuesday, November 8th, 2005 14:04Z, Philippe Verdy wrote:
    > U-nnnn already exists (or I should say, it has existed). It was
    > referring to 16-bit code units, not really to characters and was a
    > fixed-width notation (with 4 hexadecimal digits). The "U" meant
    > "Unicode" (1.0 and before).
    > U+[n...n]nnnn was created to avoid the confusion with the past 16-bit
    > only Unicode 1.0 standard (which was not fully compatible with
    > ISO/IEC 10646 code points). It is a variable-width notation that
    > refers to ISO/IEC 10646 code points. The "U" means "UCS" or
    > "Universal Character Set". At that time, the UCS code point range was
    > up to 31 bits wide.

    Well, I recall there was a time, probably later than the one Philippe
    describes, perhaps around 1997, when the notation U+xxxx was intended to
    designate 16-bit units (whether characters or code [values] I cannot
    say), while U-xxxxxxxx was intended to designate 32-bit units.
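
    For illustration only, the two forms discussed above can be sketched in
    Python. The function names and the range checks are assumptions made for
    this sketch, not anything taken from the standards: the U+ form is padded
    to at least four hexadecimal digits and capped at today's Unicode limit
    of 10FFFF, while the U- form is fixed at eight digits and capped at the
    31-bit range mentioned in the quoted text.

```python
def format_u_plus(code_point: int) -> str:
    """Variable-width form: at least 4 hex digits, more when needed."""
    # Assumption: restrict to the current Unicode range 0..10FFFF.
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"out of range for U+ notation: {code_point:#x}")
    return f"U+{code_point:04X}"


def format_u_minus(code_point: int) -> str:
    """Fixed-width form: always 8 hex digits."""
    # Assumption: restrict to the 31-bit UCS range described in the quote.
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError(f"out of range for U- notation: {code_point:#x}")
    return f"U-{code_point:08X}"


if __name__ == "__main__":
    for cp in (0x41, 0x20AC, 0x1F600, 0x10FFFF):
        print(format_u_plus(cp), format_u_minus(cp))
```

    The only difference between the two helpers is the padding width, which
    is the distinction the thread is about: four digits as a minimum versus
    eight digits always.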

    I even found that ISO/IEC 10646-1:2000 might say so in subclause 6.5,
    according to someone writing as "Ken Whistler" in,
    at the beginning of the post. Worth reading, since it mentions other
    notations that may have been standard (then) but were little used, it seems.

    I also remember asking about the introduction of the U+xxxxx and U+10xxxx
    notations, perhaps in 2000, and having this confirmed by Dr. Whistler;
    unfortunately my file archives are in poor shape, and I cannot find the post
    right now (well, the interesting one here is Ken's answer, not mine); I do
    not even remember whether it was on this list, silly me.

