Re: Comment on PRI 98: IVD Adobe-Japan1 (pt.2)

From: Andrew West
Date: Fri Mar 23 2007 - 04:59:39 CST

    On 21/03/07, Eric Muller wrote:
    > Andrew West wrote:
    > > Take for example the compatibility ideographs U+F914, U+F95C and
    > > U+F9BF, which are all canonically equivalent to U+6A02 and which all
    > > have exactly the same glyph shape. Would it have been acceptable to
    > > represent them using variation selectors as 6A02-VS1, 6A02-VS2 and
    > > 6A02-VS3 ?
    > The case of the pronunciation variants is a bit more delicate. With
    > today's understanding of what character encoding is about, I think it's
    > fair to say that accommodating pronunciation variants in plain text is a
    > non-goal, and in fact a misguided effort, in any character standard. Can
    > you imagine having two coded characters for each ideograph used in
    > Japan, one for On reading and one for Kun reading?

    I can imagine it, but I can't imagine such a character encoding
    standard existing outside of my imagination.

    But nobody, especially not me, said anything about representing
    pronunciation variants in plain text. The compatibility ideographs were
    encoded for roundtrip compatibility with existing standards, not so
    that pronunciation variants of ideographs could be represented in
    Unicode. It was you who suggested that variation selectors would have
    been a better solution than compatibility ideographs, and as most
    of the compatibility ideographs in the BMP are pronunciation variants,
    I wanted to understand how and whether variation selectors could be
    used to represent non-glyphic differences.
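
    To make the round-trip issue concrete, here is a minimal Python
    sketch using the standard unicodedata module. The helper name
    survives_nfc is hypothetical, and the sequences 6A02-VS1 to 6A02-VS3
    are used only as the hypothetical ones from the question quoted
    above, not as a claim about any registered sequence. It checks that
    normalization folds all three compatibility ideographs into U+6A02,
    whereas a base character followed by a variation selector passes
    through normalization unchanged.

```python
import unicodedata

BASE = "\u6A02"
# The three compatibility ideographs discussed above.
COMPAT = ["\uF914", "\uF95C", "\uF9BF"]
# VS1..VS3 (U+FE00..U+FE02); pairing them with U+6A02 is hypothetical here.
VARIATION_SELECTORS = ["\uFE00", "\uFE01", "\uFE02"]


def survives_nfc(text: str) -> bool:
    """Return True if NFC normalization leaves the text unchanged."""
    return unicodedata.normalize("NFC", text) == text


# Each compatibility ideograph has a singleton canonical decomposition,
# so normalization replaces it with the unified ideograph.
normalized = [unicodedata.normalize("NFC", ch) for ch in COMPAT]

# Base character plus variation selector: three distinct sequences.
vs_sequences = [BASE + vs for vs in VARIATION_SELECTORS]

for ch, norm in zip(COMPAT, normalized):
    print(f"U+{ord(ch):04X} normalizes to U+{ord(norm):04X}")

for seq in vs_sequences:
    code_points = " ".join(f"U+{ord(c):04X}" for c in seq)
    print(f"{code_points} survives NFC: {survives_nfc(seq)}")
```

    The sketch only shows that the three-way distinction is lost under
    normalization for the compatibility ideographs and kept for the
    variation sequences. It says nothing about whether a non-glyphic
    distinction is an appropriate use of variation selectors, which is
    the question at issue.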

    > > Thinking forward to Tangut,
    > I suspect it would be a hard sell today to convince the Unicode
    > community to support round-tripping with a "standard" that encodes
    > pronunciation differences.

    It will be a hard sell to get Tangutologists to migrate from the
    Mojikyo Tangut encoding to Unicode if we can't guarantee

