Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

Date: Thu Nov 01 2007 - 04:22:40 CST


    Quoting Ben Monroe:

    > John Knightley wrote:
    >> Do you have for example a passport with your name on it in this form?
    >> An ID card? A birth certificate?
    > No. My ID has U+9580 U+9F8D.
    > As much as I have tried (and continue) to fix that, the character
    > <U+2FF5 U+9580 U+9F8D> cannot be entered into a computer, so it is
    > rejected. The one official exception that I am aware of is for a
    > registered seal, which may be handled manually and does not
    > necessarily need to be computer processed. That is precisely why I
    > asked if it would be sufficient.


    >> Actually, if you make a PUA font that has your name in it, call the font
    >> BenMonroe or whatever, and distribute this font with any documents,
    >> then your friends will be able to read things fine. Many formats
    >> retain font information. PUA code points are designed for, among other
    >> things, names and intermediate solutions. With only .notdef and your
    >> name character, the font file would be very small and easy to attach or
    >> download.
    > I tried PUA in the past.
    > Most people could not be bothered to install the font.
    > Also, many people are suspicious, for good reason, of attachments.
    > It's more difficult than you may imagine to get strangers to install fonts.
    > At least with IDS, even though the glyph will most likely not be
    > rendered as desired, the IDS component can easily be ignored and the
    > reader is left with U+9580 U+9F8D, which, while not ideal, is better
    > than a .notdef.
    >> Even if the process were started today, your name would be in Extension
    >> F, which will become part of Unicode in about ten years' time. (Extension C
    >> will get in in either 2008 or 2009, Extension D at least two or three
    >> years after that, Extension E again two or three years later, etc.) Even if
    >> it were somehow included in Extension E, that would mean a wait of 5-8 years.
    > While I do appreciate the great amount of time and effort it takes to
    > encode characters, even after another decade or so, there will surely
    > still be unencoded characters. It is an open set. A glyph rendering
    > system based on IDS (with possible extensions as needed) is probably
    > the only way to cover them all. There are undoubtedly issues such as
    > normalization, but surely they can be dealt with.

    There will certainly be unencoded CJKV characters in a decade's time. I
    reach retirement age in 21 years' time, but the job will not be
    finished by then, even though I predict that the number of encoded CJKV
    characters will by then be double the present 70 thousand.

    Most characters are like the one in your name: their components do not
    overlap, so an IDS describes them clearly. Parsing such characters is
    not too difficult. But just as people object to installing fonts, they
    might object even more to installing software merely to see someone's
    name clearly.
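
    To illustrate why parsing is not too difficult, here is a minimal
    sketch in Python. It assumes only the twelve Ideographic Description
    Characters U+2FF0..U+2FFB, of which U+2FF2 and U+2FF3 take three
    operands and the rest take two. The names parse_ids and fallback_text
    are made up for this example and do not come from any existing
    library. The sketch does no rendering and no normalization; it only
    builds the tree and shows the fallback you describe, where a reader
    ignores the IDS operator and is left with U+9580 U+9F8D.

```python
# Arity of each Ideographic Description Character in U+2FF0..U+2FFB.
# Assumption: only these twelve operators are handled.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3  # left, middle, right
IDC_ARITY["\u2FF3"] = 3  # above, middle, below


def parse_ids(text, pos=0):
    """Parse one IDS starting at text[pos]; return (tree, next_pos).

    A tree is either a single character or a tuple
    (operator, operand, operand[, operand]).
    """
    if pos >= len(text):
        raise ValueError("truncated IDS")
    ch = text[pos]
    pos += 1
    if ch not in IDC_ARITY:
        return ch, pos  # an ordinary character is a leaf
    operands = []
    for _ in range(IDC_ARITY[ch]):
        operand, pos = parse_ids(text, pos)  # operands may nest
        operands.append(operand)
    return (ch, *operands), pos


def fallback_text(text):
    """Drop the IDS operators, keeping only the component characters."""
    return "".join(ch for ch in text if ch not in IDC_ARITY)


# The sequence from this thread: <U+2FF5 U+9580 U+9F8D>
name = "\u2FF5\u9580\u9F8D"
tree, end = parse_ids(name)
plain = fallback_text(name)
```

    Here tree is the tuple (U+2FF5, U+9580, U+9F8D) and plain is the
    two-character string U+9580 U+9F8D. Turning such a tree into an
    actual glyph is the hard part, and is not attempted here.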

    > Ben Monroe
    > ???


    This archive was generated by hypermail 2.1.5 : Thu Nov 01 2007 - 07:06:03 CST