Re: How many characters?

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Nov 22 2005 - 17:31:54 CST


    Otto:

    > for an introductory lecture on Unicode, I'd like to have some numbers,
    > along the lines of:
    > xxxx control characters assigned
    > xxxx graphic characters assigned
    > xxxx code points reserved for future standardization
    > 2048 surrogate code points defined
    > xxx noncharacters defined
    > xxx private-use code points defined
    > -------------------------------------------------------
    > 1114112 code points altogether

    Unicode 4.1:

      51644 graphic characters assigned (BMP)
         31 format control characters assigned (BMP)
         65 control characters assigned (BMP)
       6400 private use characters assigned (BMP)
       2048 surrogate code points designated (BMP)
         34 noncharacter code points designated (BMP)
       5314 reserved code points (BMP)
      45980 graphic characters assigned (supplementary planes)
     131068 private use characters assigned (supplementary planes)
         32 noncharacter code points designated (supplementary planes)
     871496 reserved code points (supplementary planes)
    ------------------------------------------------------------------
    1114112 code points altogether
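
    As a sanity check, the rows that never change between versions can be
    derived from the layout of the code space, and the remaining rows can be
    summed per region. The sketch below is illustrative only: the names
    (span, bmp_4_1, supp_4_1) are not from the original message, and the
    ranges are the standard surrogate, private-use, noncharacter, and
    C0/C1 control ranges.

```python
# Illustrative check of the Unicode 4.1 tallies; names are hypothetical.
PLANE_SIZE = 0x10000   # 65536 code points per plane
PLANES = 17            # plane 0 (BMP) plus 16 supplementary planes


def span(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


# Rows fixed by the architecture of the code space.
surrogates = span(0xD800, 0xDFFF)
bmp_private_use = span(0xE000, 0xF8FF)
supp_private_use = span(0xF0000, 0xFFFFD) + span(0x100000, 0x10FFFD)
bmp_nonchars = span(0xFDD0, 0xFDEF) + 2      # plus U+FFFE and U+FFFF
supp_nonchars = 2 * (PLANES - 1)             # last two code points of each plane
controls = span(0x0000, 0x001F) + span(0x007F, 0x009F)

# Version-dependent rows are copied from the Unicode 4.1 table.
bmp_4_1 = {
    "graphic": 51644,
    "format control": 31,
    "control": controls,
    "private use": bmp_private_use,
    "surrogate": surrogates,
    "noncharacter": bmp_nonchars,
    "reserved": 5314,
}
supp_4_1 = {
    "graphic": 45980,
    "private use": supp_private_use,
    "noncharacter": supp_nonchars,
    "reserved": 871496,
}

bmp_total = sum(bmp_4_1.values())
supp_total = sum(supp_4_1.values())
grand_total = bmp_total + supp_total

print("BMP:", bmp_total)
print("Supplementary planes:", supp_total)
print("All code points:", grand_total)
```

    Each region should account for every code point in it: one plane for
    the BMP and sixteen planes for the rest.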

    Unicode 5.0:

      51986 graphic characters assigned (BMP)
         31 format control characters assigned (BMP)
         65 control characters assigned (BMP)
       6400 private use characters assigned (BMP)
       2048 surrogate code points designated (BMP)
         34 noncharacter code points designated (BMP)
       4972 reserved code points (BMP)
      47007 graphic characters assigned (supplementary planes)
     131068 private use characters assigned (supplementary planes)
         32 noncharacter code points designated (supplementary planes)
     870469 reserved code points (supplementary planes)
    ------------------------------------------------------------------
    1114112 code points altogether
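
    Only the graphic and reserved rows differ between the two tables, so
    the number of characters added in 5.0 can be read off as a difference.
    A minimal sketch, with the figures copied from the tables above and
    hypothetical dictionary keys:

```python
# Rows that differ between the Unicode 4.1 and 5.0 tables above.
# The key names are illustrative, not from the original message.
V4_1 = {
    "bmp_graphic": 51644,
    "bmp_reserved": 5314,
    "supp_graphic": 45980,
    "supp_reserved": 871496,
}
V5_0 = {
    "bmp_graphic": 51986,
    "bmp_reserved": 4972,
    "supp_graphic": 47007,
    "supp_reserved": 870469,
}

# Per-row change from 4.1 to 5.0 (positive means the row grew).
delta = {key: V5_0[key] - V4_1[key] for key in V4_1}

new_characters = delta["bmp_graphic"] + delta["supp_graphic"]
reserved_consumed = -(delta["bmp_reserved"] + delta["supp_reserved"])

print("New graphic characters (BMP):", delta["bmp_graphic"])
print("New graphic characters (supplementary):", delta["supp_graphic"])
print("New characters in total:", new_characters)
print("Reserved code points consumed:", reserved_consumed)
```

    The two totals should match, since every newly assigned character
    takes a code point that was previously reserved.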

    >
    > I am surprised that I could not find this info in the FAQ
    > (I have looked in <http://www.unicode.org/faq/basic_q.html>),
    > nor in <http://www.unicode.org/versions/Unicode4.1.0/>.

    These kinds of statistics have regularly been printed in
    Appendix D of the book, so you can find them online for
    Unicode 4.0. Be aware, though, that the 4.0 table contained
    several errors. See the 2005-August-19 entry in
    http://www.unicode.org/errata/ for complete corrections.

    >
    > Note that the number of newly assigned characters is given
    > for each version, but the total is very hard to find.
    >
    > Thank you for a timely answer.

    --Ken


