Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Ben Monroe
Date: Wed Oct 31 2007 - 22:21:56 CST

    Apologies for the belated response.
    Been rather busy.

    John Knightley wrote:

    > Mr <U+2FF5 U+9580 U+9F8D> will be pleased.

    Yes, I am.
    However, I expected the following protest.

    Andrew West wrote:

    > If we were to encode it now on Ben's word that he needs it, and he dies
    > before achieving the fame that he undoubtedly deserves, Unicode will
    > be lumbered ever after with a character that nobody needs.

    I hope that you are not wishing that I pass away any time soon.

    Seriously though, out of over a million code points, not one could
    be given for my surname?
    (Before you suggest the PUA, please read through to the end of my message.)

    There are a great many encoded characters that are honestly of no use to me.
    But surely they are useful to at least someone. And when that someone
    desires to communicate said character(s) with others, it becomes
    useful to many.

    In the real, present world I do in fact (hand) write my name as <U+2FF5
    U+9580 U+9F8D>.
    You and others may not find it useful, but I have an immediate,
    daily use for it.

    I'm not asking you to take my word.
    What kind of documents would you like?
    My resources as an individual are limited, but if it is within my
    ability, I will supply whatever I can.
    Besides material I have written myself, I have correspondence from
    others using the character.
    I believe I even have a redelivery postal notice with it.

    So, if the opposition is so great that the character will not be
    encoded, what are my options?

    1) Use an embedded graphic image
    -No longer plain text.
    -Limited to specific environments and usages

    2) PUA
    -Will need to _constantly_ distribute a custom font.
    -Said font will need to be installed by recipients.
    -Possible overlap with other installed PUA fonts.
    Quite acceptable for a publishing house to encode a document for
    publication. But hardly acceptable for daily communication such as
    e-mail with an indefinite number of people.

    3) IDS
    Quoting John Jenkins:
    -"An IDS is *not* the same thing as encoding. It should be considered
    a better-than-nothing stop-gap until something appropriate comes along
    (either an encoded character or a registered variation sequence)".
    -"Using an IDS in running text is a hack."
    -Will not render correctly in most environments.

    4) Hand write my name one way; type it another way. Certainly not
    ideal, but it basically represents the status-quo.
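
The overlap risk in option 2 can be sketched in a few lines of Python. The two font tables and their glyph labels below are hypothetical, invented only for illustration. The one real API used is the standard library's `unicodedata.category`, which reports `Co` for Private Use code points.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if ch is a Private Use code point (general category 'Co')."""
    return unicodedata.category(ch) == "Co"


# Hypothetical font tables: code point -> description of the glyph drawn there.
# Neither font exists; the assignments are assumptions for this sketch.
font_a = {"\uE000": "surname glyph"}
font_b = {"\uE000": "company logo"}


def pua_conflicts(a: dict, b: dict) -> list:
    """List PUA code points that both fonts assign to different glyphs."""
    shared = a.keys() & b.keys()
    return sorted(cp for cp in shared if is_private_use(cp) and a[cp] != b[cp])


for cp in pua_conflicts(font_a, font_b):
    print(f"U+{ord(cp):04X}: {font_a[cp]!r} vs {font_b[cp]!r}")
```

Nothing in the text itself says which of the two meanings is intended. That depends entirely on which font the recipient happens to have installed.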

    Options 1-3 are all problematic. And yet option 4 is not acceptable
    either. I see no other option but to use the IDS *hack* "until
    something appropriate comes along".
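
For reference, the sequence in option 3 can be inspected with a short Python sketch. It only illustrates that <U+2FF5 U+9580 U+9F8D> is three separate code points in plain text rather than one encoded character. The helper name `describe` is an assumption made for this illustration.

```python
import unicodedata

# The Ideographic Description Sequence from this message:
# an IDS operator followed by its two component ideographs.
IDS = "\u2FF5\u9580\u9F8D"


def describe(text: str) -> list:
    """Return (code point, character name) pairs for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


for code_point, name in describe(IDS):
    print(code_point, name)

# The sequence is three characters long, so searching, sorting and
# rendering all treat it as three items unless software interprets the IDS.
print(len(IDS), "code points,", len(IDS.encode("utf-8")), "bytes in UTF-8")
```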

    If I cannot even type my own name in the Universal Character Set (UCS),
    then perhaps it is more like a Semi-Universal Character Set (SUCS).

    I notice a "w" and "W" in your name.
    My understanding is that it is a digraph of <uu>.
    It is a neologism, albeit an old one.
    For the sake of discussion, suppose _hypothetically_ that double u was
    not encoded for this reason.
    I assume that you would still desire to handwrite your name with a
    digraph glyph.
    However, when you try to enter it on a computer, the best that you
    could do is "Andreuu UUest".
    And then when you try to argue that double u should be encoded, you
    are asked for official documentation supporting this stance. Except
    official documentation is limited to the currently encoded characters,
    which do not include a double u digraph. Catch-22. It's quite

    Andrew West wrote:

    > [...] the request to encode <U+2FF5 U+9580 U+9F8D> did not come from a national body, and,
    > critically, was not accompanied by any supporting evidence that there is a need to encode
    > the character. I don't like cutesy made-up characters, but if there is evidence that a character
    > is used in the public domain (e.g. names of race horses) then it may well be appropriate to
    > encode it. It's all a question of evidence, which in the case of Ben's character is entirely absent.

    I suspect that any evidence that I can produce will not be sufficient
    for you. However, if you have any suggestions and it is within my
    ability, I can certainly try my best.

    In addition to what I wrote above, will a legally registered (with the
    local ward office) seal suffice? I was recently told that I could
    register my seal as is (<U+2FF5 U+9580 U+9F8D>), as a seal need not be
    limited by encoding issues. I plan on doing so next July, when I am
    next there.

    Tōkyō, Japan

    PS If the above does not render correctly on your system, send
    feedback to your OS maker and / or the Unicode Consortium. There are
    no better options.

    This archive was generated by hypermail 2.1.5 : Wed Oct 31 2007 - 22:24:10 CST