Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Andrew West
Date: Wed Oct 31 2007 - 07:33:56 CST

    On 31/10/2007, <> wrote:
    > > We do not, and hope never will, encode characters just because someone
    > > says that they use it for writing their name. And even if someone can
    > > prove that they do use a special (non-unifiable) character for writing
    > > their name it should only be encoded if it is used in a wider context
    > > than someone's personal correspondence, for example in a book or a
    > > newspaper, or at the very least in a national ID system.
    > Whilst we all know that Unicode doesn't encode names, CJKV is an
    > exception to this, or at least was in the past.

    With CJKV accounting for more than 70% of Unicode, perhaps the rest of
    Unicode is the exception ;-)

    > About 10% of the 70,000
    > or so CJKV characters are personal names where even the pronunciation
    > is unsure. In the past both Taiwan and Hong Kong operated a system
    > whereby, upon registering the names of newborns, immigrants, etc., the
    > name (the character, not the pronunciation) was stored digitally, and
    > any characters not in the system were simply added. In Taiwan, by law,
    > such records must be maintained for nine generations. The names need
    > to be exact; consider the headline "Murderer goes free because
    > character printed wrong".

    Indeed, there is a requirement at the national level to be able to
    represent personal use ideographs for ID systems etc., which I
    acknowledged in my message, but the request to encode <U+2FF5 U+9580
    U+9F8D> did not come from a national body, and, critically, was not
    accompanied by any supporting evidence that there is a need to encode
    the character. I don't like cutesy made-up characters, but if there is
    evidence that a character is used in the public domain (e.g. names of
    race horses) then it may well be appropriate to encode it. It's all a
    question of evidence, which in the case of Ben's character is entirely
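
For readers unfamiliar with the notation, <U+2FF5 U+9580 U+9F8D> is an Ideographic Description Sequence: a description character followed by the components it combines. The snippet below is only a sketch of how the sequence can be unpacked with Python's standard `unicodedata` module; the helper name `describe` is made up for illustration, and the names printed depend on the Unicode data shipped with the interpreter.

```python
import unicodedata

# The sequence quoted in the message, written with escapes so the
# source stays ASCII: an IDS operator followed by two ideographs.
ids = "\u2FF5\u9580\u9F8D"


def describe(seq):
    """Return (code point, character name) pairs for each character in seq.

    Hypothetical helper for illustration; it only looks names up and does
    not validate that seq is a well-formed description sequence.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in seq
    ]


for code_point, name in describe(ids):
    print(code_point, name)
```

The first entry is the "surround from above" description character, so the sequence describes the second ideograph (U+9580) enclosing the third (U+9F8D) from above. It describes a shape; it is not itself an encoded character.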

