Mongolian Rant (was: Biblical Hebrew... was: Tibetan... was: ...)

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Jun 27 2003 - 20:25:50 EDT

    Andrew West wrote:

    > I have to agree 100% with Peter on this. The potential fiasco with regard to
    > Mongolian Free Variation Selectors is another area where our grandchildren are
    > going to be weeping with despair if we are not careful.

    Well, I doubt that our grandchildren will be quite *that* lachrymose, but... ;-0

    > The standardized
    > variants for Mongolian were set in stone by Unicode based on an unfortunate but
    > understandable misunderstanding of the infamous TR170, and now that it is
    > apparent from Chinese and Mongolian sources that Unicode had got hold of
    > completely the wrong end of the stick (the defined standardized variants are
    > actually intended for use in isolation only, and the same MFVS that selects one
    > variant form in isolation may be used to select a completely different variant
    > within running text ... which of course it can't according to the Standardized
    > Variants document), instead of just wiping the slate clean and redefining a new
    > and consistent set of standardized variants that correspond to actual usage
    > within China and Mongolia, Unicode is determined to preserve the original
    > erroneous standardised variants come hell or high water - even though no-one has
    > ever seriously used them yet (well, the Chinese and Mongolians will go ahead and
    > do it their way whatever Unicode decides).

    The Mongolian variants are normatively defined in tables in both 10646 and in
    the Unicode Standard, but there is normative, and then there is immutable.

    Fortunately, the definition of standardized variation sequences is not
    entangled in any of the stability guarantees for normalization. As far
    as I know, neither WG2 (in its guiding Principles and Procedures document)
    nor the Unicode Consortium, at:

    http://www.unicode.org/standard/stability_policy.html

    has made any public guarantee about the immutability of these tables.
    Of course, there is the general assumption that things that are
    standardized stay stable, but we are, as Andrew implies, still in
    early days in terms of understanding all the implications for
    Mongolian implementations. So enough with the "come hell or high
    water" rhetoric in this case.

    If it turns out that the actual implementations need to define the variants
    differently than in the current tables for the MFVS standardized variation
    sequences, then we just need to define the appropriate technical
    corrigenda for the standards, and then get on with our lives. In this
    case, if the only existing implementations need the tables revised, then the
    standard should reflect the consensus practice, rather than vice versa. And in *this*
    case, there are no stability guarantees (that I know of) standing in
    the way of fixing the problem.

    The only barrier is one of getting a sufficiently clear statement of the
    implementation requirements and a sufficiently detailed specification of
    the results in front of the committees to enable them to close the loop
    on this.

    BTW, for what it's worth, I tried arguing a different interpretation of
    how the MFVS characters had to interact with Mongolian glyph selection
    than the simple x + MFVS --> fixed glyph interpretation that ended up
    in the tables, but I personally had neither the intimate knowledge
    of the Mongolian writing system nor the bandwidth to really make the
    case. That was a fight for another day.
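
    The difference between the two interpretations can be sketched roughly
    as follows. This is an illustrative Python sketch only: the glyph names,
    the table contents, and the function names are hypothetical and are not
    taken from the Standardized Variants data. Only the code points
    U+180B..U+180D (the MFVS characters) and U+1820 (MONGOLIAN LETTER A)
    are real.

```python
# Mongolian Free Variation Selectors (real code points).
FVS1, FVS2, FVS3 = "\u180B", "\u180C", "\u180D"
MONGOLIAN_A = "\u1820"  # MONGOLIAN LETTER A

# Model 1: "x + MFVS --> fixed glyph".
# The key is only (base, selector), so one sequence can name one glyph.
# Glyph names here are hypothetical placeholders.
FIXED_TABLE = {
    (MONGOLIAN_A, FVS1): "a.variant1",
}

# Model 2: the same sequence may select different glyphs depending on
# where the letter sits in the word. Entries are hypothetical placeholders.
CONTEXT_TABLE = {
    (MONGOLIAN_A, FVS1, "isolate"): "a.isol.variant1",
    (MONGOLIAN_A, FVS1, "medial"): "a.medi.variant1",
}


def select_fixed(base, selector):
    """Look up a glyph without regard to positional form."""
    return FIXED_TABLE.get((base, selector))


def select_contextual(base, selector, position):
    """Look up a glyph using the positional form as part of the key."""
    return CONTEXT_TABLE.get((base, selector, position))
```

    Under the first model there is no way to say that the same sequence
    means one thing in isolation and another in running text; under the
    second, the positional form is simply part of the lookup key.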

    >
    > And before Peter suggests it, I have already suggested elsewhere that if Unicode
    > can't fix past errors, the only course might be for Unicode to deprecate the
    > MFVSs, and start again from scratch - didn't go down too well!

    Don't deprecate the MFVSs. Why? Instead, create a technical corrigendum that
    fixes the table(s) and updates the definition of their semantics
    until it matches consensus implementation in actual practice.
    Note that you are a leg up on this already because the MFVS characters
    are not submerged in the collection of generic variation selectors.
    They were defined for Mongolian for a reason, and the fact that
    they are *Mongolian* variation selectors ought to give us some
    room for making them function as intended for Mongolian.

    Nobody that I know of is deliberately *trying* to maintain mistakes
    or unimplementable features in the standard out of some perverse
    pleasure in frustrating people who are faithfully trying to implement
    Unicode.

    --Ken


