Re: Unicode Encoding Illustration

From: Philippe Verdy
Date: Fri Aug 20 2004 - 04:31:19 CDT


    From: "Jonathan Coxhead"
    > John Tisdale wrote:
    > > I've created an illustration to accompany my MSDN article to provide a
    > > high-level overview of Unicode encoding. I would appreciate any feedback
    > > related to accuracy and clarity.
    > >
    > >
    > >
    > > Thanks, John

    Additionally, the labels are incorrect or misleading:
    - Add a layer above the first level showing grapheme clusters, which are made
    of one or more abstract characters.
    - There's a missing layer between the first (currently CCS) and second (CEF)
    box. The content of the first box does not show a "Coded" character set,
    only examples of abstract characters, each with a normative reference (U+....)
    and a representative glyph. So the top box should be labelled "The
    Unicode-ISO/IEC 10646 character repertoire" of abstract characters. The 3
    examples shown should be labelled "abstract characters"; by themselves
    they are not encoded, only named by the U+xxxx reference.
    - The CEF box should have labels "Code Unit(s)" with the optional plural.
    - Add a layer below the top box showing code points (numeric values), which is
    the appropriate way to represent the Unicode CCS. Label the example boxes with
    "Code point(s)" with the optional plural.
    - The bottom box should not contain the term "Code Unit" but "byte(s)". Code
    units exist only in a CEF, not in a CES.
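
    The layering described above (abstract characters, then code points in
    the CCS, then code units in the CEF, then bytes in the CES) can be
    sketched in a few lines of Python. This is only an illustration under
    stated assumptions: the function names are hypothetical, the two sample
    characters (U+00E9 and U+10400) are my own choice and not the ones from
    the figure, and UTF-16 is picked as the CEF simply because its code
    units and bytes differ visibly.

```python
import struct

def code_points(text):
    """CCS level: each abstract character is assigned one integer code point."""
    return [ord(ch) for ch in text]

def code_units_utf16(text):
    """CEF level (UTF-16): code points are mapped to 16-bit code units.

    The big-endian codec is used here only as a convenient way to obtain
    the 16-bit values; the code units themselves have no byte order.
    """
    data = text.encode("utf-16-be")
    return list(struct.unpack(">%dH" % (len(data) // 2), data))

def serialized_bytes(text, scheme):
    """CES level: code units are serialized to bytes in a fixed byte order."""
    return list(text.encode(scheme))

# Illustrative sample (an assumption, not taken from the figure):
# U+00E9 fits in one UTF-16 code unit, U+10400 needs a surrogate pair.
sample = "\u00e9\U00010400"

print([f"U+{cp:04X}" for cp in code_points(sample)])
# ['U+00E9', 'U+10400']
print([f"{cu:04X}" for cu in code_units_utf16(sample)])
# ['00E9', 'D801', 'DC00']
print([f"{b:02X}" for b in serialized_bytes(sample, "utf-16-be")])
# ['00', 'E9', 'D8', '01', 'DC', '00']
print([f"{b:02X}" for b in serialized_bytes(sample, "utf-16-le")])
# ['E9', '00', '01', 'D8', '00', 'DC']
```

    Two code points become three code units, and the same three code units
    become two different byte sequences depending on the encoding scheme,
    which is why the bottom box should talk about bytes and not code units.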

    This archive was generated by hypermail 2.1.5 : Fri Aug 20 2004 - 04:37:49 CDT