Re: 'code unit' and 'code point' meaning check

From: Mark Davis (mark.davis@jtcsv.com)
Date: Wed May 14 2003 - 18:24:13 EDT


    > (I'm sure this is an FAQ - but why are the code points 0xd800-0xdfff not
    > considered noncharacters? Obviously no abstract character can be associated
    > with them! Is there a different term that describes code points like this?)

    They are called surrogate code points.

    Märk Davis
    ________
    mark.davis@jtcsv.com
    IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
    (408) 256-3148
    fax: (408) 256-0799
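
As a rough illustration of the reply above, here is a minimal Python sketch of why the range 0xD800-0xDFFF is set apart: UTF-16 uses those values as code units, in pairs, to represent code points above 0xFFFF. The function names `is_surrogate` and `utf16_code_units` are hypothetical helpers made up for this example, not part of any library, and the sketch is not taken from the thread itself.

```python
# Hypothetical helpers for illustration only (not a library API).
SURROGATE_FIRST = 0xD800  # first surrogate code point
SURROGATE_LAST = 0xDFFF   # last surrogate code point


def is_surrogate(code_point: int) -> bool:
    """Return True if code_point lies in the surrogate range."""
    return SURROGATE_FIRST <= code_point <= SURROGATE_LAST


def utf16_code_units(code_point: int) -> list:
    """Return the 16-bit code units that represent code_point in UTF-16."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a code point")
    if is_surrogate(code_point):
        raise ValueError("surrogate code points are not encoded on their own")
    if code_point < 0x10000:
        # One code unit whose value equals the code point.
        return [code_point]
    # Above 0xFFFF: split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


if __name__ == "__main__":
    for cp in (0x41, 0x20AC, 0x1D11E):
        units = " ".join(f"{u:04X}" for u in utf16_code_units(cp))
        print(f"U+{cp:04X} -> {units}")
```

Under this sketch, a code point below 0x10000 maps to a single code unit, while a code point such as 0x1D11E maps to two code units that both fall inside the surrogate range.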

    ----- Original Message -----
    From: "Rick Cameron" <Rick.Cameron@crystaldecisions.com>
    To: "Ben Dougall" <bend@freenet.co.uk>; <unicode@unicode.org>
    Sent: Wednesday, May 14, 2003 14:48
    Subject: RE: 'code unit' and 'code point' meaning check

    > You can find the new, improved definitions of code point and code unit in
    > the online draft of Chapter 3 of TUS 4.0,
    > http://www.unicode.org/book/preview/ch03.pdf
    >
    > A code point is a number between 0 and 0x10ffff. It is independent of the
    > encoding form.
    >
    > A code unit is the basic chunk of bits in one of the encoding forms of
    > Unicode - an 8-bit chunk in UTF-8, a 16-bit chunk in UTF-16 and a 32-bit
    > chunk in UTF-32.
    >
    > (I'm sure this is an FAQ - but why are the code points 0xd800-0xdfff not
    > considered noncharacters? Obviously no abstract character can be associated
    > with them! Is there a different term that describes code points like this?)
    >
    > - rick
    >
    > -----Original Message-----
    > From: Ben Dougall [mailto:bend@freenet.co.uk]
    > Sent: Wednesday, 14 May 2003 13:29
    > To: unicode@unicode.org
    > Subject: 'code unit' and 'code point' meaning check
    >
    >
    > could someone confirm if i've got this correct, or not please?:
    >
    > a 'code unit' could be the same as a 'code point', but there again it
    > might not be. it's possible that several 'code units' are required to
    > make up a 'code point'? (so code units can be the same size or smaller
    > than a code point, but not the other way round)?
    >
    >
    >
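
The question and definitions quoted in this thread can be checked with a short Python sketch that counts how many code units each encoding form needs for the same text. It assumes Python 3, where `len()` on a `str` counts code points; `code_unit_counts` is a hypothetical helper name chosen for this example.

```python
def code_unit_counts(text: str) -> dict:
    """Count code points and code units per encoding form for text."""
    return {
        "code points": len(text),
        # UTF-8 code units are 8 bits wide: one byte each.
        "UTF-8": len(text.encode("utf-8")),
        # UTF-16 code units are 16 bits wide: two bytes each.
        # The "-le" codec variant avoids counting a byte order mark.
        "UTF-16": len(text.encode("utf-16-le")) // 2,
        # UTF-32 code units are 32 bits wide: four bytes each.
        "UTF-32": len(text.encode("utf-32-le")) // 4,
    }


if __name__ == "__main__":
    for sample in ("A", "\u20ac", "\U0001d11e"):
        print(f"U+{ord(sample):04X}", code_unit_counts(sample))
```

In this sketch a single code point takes one to four code units in UTF-8, one or two in UTF-16, and always exactly one in UTF-32, which matches the observation that several code units may be needed to make up one code point.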


