Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Andrew West (andrewcwest@gmail.com)
Date: Wed Oct 31 2007 - 04:11:45 CST

    On 31/10/2007, John H. Jenkins <jenkins@apple.com> wrote:
    >
    > > after doing a check based on the IDS I can find no unifiable variant
    > > of <U+2FF5 U+9580 U+9F8D>. I checked twice, first after the original
    > > posting and again after your posting.
    >
    > Exactly. It isn't a variant of anything by any reasonable definition,
    > so it should be separately encoded. Marking it as "not to be encoded"
    > was an error.

    Strong disagreement here. The fact that the character in question is
    not unifiable with any existing character is a red herring (my
    apologies in advance to PV). In my opinion the decision not to encode
    this character was absolutely correct given the evidence provided for
    its usage:

    <quote source="L2/07-161">
    Source
    Ben Monroe on unicode@unicode.org claimed it as the
    way he wrote his Japanese surname (message ID
    <008301c1cc6c$2204aab0$9575e60c@ben2ahqgswn0hr>)
    </quote>

    We do not, and hope we never will, encode characters just because someone
    says that they use them for writing their name. And even if someone can
    prove that they do use a special (non-unifiable) character for writing
    their name, it should only be encoded if it is used in a wider context
    than someone's personal correspondence, for example in a book or a
    newspaper, or at the very least in a national ID system.

    But as John Jenkins says, this "isn't so much a rejection as a
    rejection-pending kind of thing". If Ben becomes famous enough that
    newspapers start referring to him as "<U+2FF5 U+9580 U+9F8D>.弁" then
    the character will be a suitable candidate for encoding. In the meantime
    it is just a cute (albeit quite clever) personal-usage neologism. If
    we were to encode it now on Ben's word that he needs it, and he dies
    before achieving the fame that he undoubtedly deserves, Unicode will
    be lumbered ever after with a character that nobody needs.

    Andrew


