Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Ben Monroe
Date: Thu Nov 01 2007 - 02:02:40 CST

    John Knightley wrote:

    > Do you have for example a passport with your name on it in this form?
    > An ID card? A birth certificate?

    No. My ID has U+9580 U+9F8D.
    As much as I have tried (and continue to try) to fix that, the character
    <U+2FF5 U+9580 U+9F8D> cannot be entered into a computer, so it is
    rejected. The one official exception that I am aware of is for a
    registered seal, which may be handled manually and does not
    necessarily need to be computer processed. That is precisely why I
    asked if it would be sufficient.

    > Actually, if you make a PUA font that has your name in it, call the font
    > BenMonroe or whatever, and distribute this font with any documents,
    > then your friends will be able to read things fine. Many formats
    > retain font information. PUA code points are designed for, among other
    > things, names and intermediate solutions. With only .notdef and your
    > name character the font file would be very small, easy to attach or
    > download.

    I tried PUA in the past.
    Most people could not be bothered to install the font.
    Also, many people are suspicious, for good reason, of attachments.
    It's more difficult than you may imagine to get strangers to install fonts.

    At least with IDS, even though the glyph will most likely not be
    rendered as desired, the IDS component can easily be ignored and the
    reader is left with U+9580 U+9F8D, which, while not ideal, is better
    than a .notdef.
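
    A minimal sketch of that fallback, for illustration only: it assumes
    the Ideographic Description Characters are the twelve operators in
    U+2FF0..U+2FFB, and the helper name strip_idc is hypothetical, not
    part of any library. It simply drops the operators and keeps the
    component characters.

```python
# Assumption: IDS operators (Ideographic Description Characters) are the
# code points U+2FF0..U+2FFB. Adjust the range if more operators are needed.
IDC_FIRST, IDC_LAST = 0x2FF0, 0x2FFB


def strip_idc(text: str) -> str:
    """Drop IDS operators, keeping only the component characters."""
    return "".join(ch for ch in text if not (IDC_FIRST <= ord(ch) <= IDC_LAST))


name = "\u2FF5\u9580\u9F8D"  # <U+2FF5 U+9580 U+9F8D>
fallback = strip_idc(name)
print([f"U+{ord(ch):04X}" for ch in fallback])  # ['U+9580', 'U+9F8D']
```

    A reader or application that does not understand IDS could apply the
    same idea by eye or in software and still recover the two base
    characters.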

    > Even if the process were started today, your name would be in Extension
    > F, which will become part of Unicode in about ten years' time. (Extension C
    > will get in in either 2008 or 2009, Extension D at least two or three
    > years after that, Extension E again two or three years later, etc.) Even if
    > it were somehow included in Extension E, that would mean a wait of 5-8 years.

    While I do appreciate the great amount of time and effort it takes to
    encode characters, even after another decade or so, there will surely
    still be unencoded characters. It is an open set. A glyph rendering
    system based on IDS (with possible extensions as needed) is probably
    the only way to cover them all. There are undoubtedly issues such as
    normalization, but surely they can be dealt with.
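
    As a rough idea of the first step such a system would need, here is
    a sketch of a parser that turns an IDS into a tree a renderer could
    walk. It assumes only the operators U+2FF0..U+2FFB, with U+2FF2 and
    U+2FF3 taking three operands and the rest taking two; the function
    names parse_ids and parse_whole are made up for this example, and
    nothing here addresses normalization or actual glyph composition.

```python
# Assumption: operators are U+2FF0..U+2FFB; U+2FF2 and U+2FF3 take three
# operands, all others take two. Anything else is treated as a component.
TERNARY = {0x2FF2, 0x2FF3}
BINARY = set(range(0x2FF0, 0x2FFC)) - TERNARY


def parse_ids(text, pos=0):
    """Parse one IDS starting at pos.

    Returns (tree, next_pos), where tree is either a single character
    (a leaf) or a tuple (operator, [children]).
    """
    if pos >= len(text):
        raise ValueError("IDS ended before all operands were supplied")
    ch = text[pos]
    cp = ord(ch)
    if cp in TERNARY:
        arity = 3
    elif cp in BINARY:
        arity = 2
    else:
        return ch, pos + 1  # plain component character
    pos += 1
    children = []
    for _ in range(arity):
        child, pos = parse_ids(text, pos)  # operands may themselves be IDS
        children.append(child)
    return (ch, children), pos


def parse_whole(text):
    """Parse text as exactly one IDS, rejecting trailing characters."""
    tree, end = parse_ids(text)
    if end != len(text):
        raise ValueError("trailing characters after a complete IDS")
    return tree


print(parse_whole("\u2FF5\u9580\u9F8D"))
```

    For <U+2FF5 U+9580 U+9F8D> this yields the operator U+2FF5 with
    U+9580 and U+9F8D as its two children; a malformed sequence raises
    ValueError rather than being rendered incorrectly.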

    Ben Monroe
