Re: The result of the plane 14 tag characters review.

From: Doug Ewell (dewell@adelphia.net)
Date: Wed Nov 13 2002 - 00:50:51 EST

    Kenneth Whistler <kenw at sybase dot com> wrote:

    > The Unicode Technical Committee would like to announce that no
    > formal decision has been taken regarding the deprecation of
    > Plane 14 language tag characters. The period for public review of
    > this issue will be extended until February 14, 2003.

    Gee, a press conference after all. Too bad my TV was turned off.

    No, seriously, thanks for the update. I'm glad to see the matter was
    considered worthy of further study. Hopefully other people who have an
    opinion on Plane 14 will contribute to the public review.

    Ken also wrote:

    > Doug's contribution would be
    > more convincing if it dropped away the irrelevancies about whether
    > the *function* of language tagging is useful and focussed completely
    > on the appropriateness of this *particular* set of characters on
    > Plane 14 as opposed to any other means of conveying the same
    > distinctions.

    That's why I included a "severability" clause, to the effect that if one
    of my arguments was bogus (or irrelevant) it shouldn't affect the
    credibility of the others.

    To answer the question "why Plane 14 plain text instead of markup," I
    suppose I need to make the case that this meta-information is sometimes
    appropriate in short strings and labels where rich text is overkill.
    This was basically the argument put forth by the ACAP people. I did
    some homework on the MLSF proposal (a little late, I know) and saw that
    their primary perceived need was for tagging short strings in protocols
    which did not lend themselves to an additional rich-text layer.

    After seeing the MLSF tagging scheme, I agree more than ever that its
    deployment would have jeopardized the usefulness of UTF-8. Although the
    number of proposals like this to "extend" or "enhance" UTF-8 has
    diminished greatly since then, it would be a shame to see them resurface
    on the basis that "Unicode doesn't provide us any alternative."

    To me, the most difficult part of the "Save Plane 14" campaign seems to
    be convincing people that not every text problem lends itself to a
    markup solution. Without questioning the current and future importance
    of HTML and XML, there *is* text in the world that is not wrapped in one
    of these formats, and cannot be reasonably converted to them, yet still
    needs to be processed in some way.

    Judging from the discussion on the list last week, there also seems to
    be a perception that Plane 14 tags require a great deal of overhead,
    even to ignore them. I'd like to continue that discussion (especially
    since the public-review period has been extended) and ask:

    1. What extra processing is necessary to interpret Plane 14 tags that
    wouldn't be necessary to interpret any other form of tags?

    2. What extra processing is necessary to ignore Plane 14 tags that
    wouldn't be necessary to ignore any other Unicode character(s)?

    3. Is there any method of tagging, anywhere, that is lighter-weight
    than Plane 14? (Corollary: Is "lightweight" important?)
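
    As a rough illustration of what questions 1 and 2 are asking about,
    here is a minimal Python sketch, not a reference implementation. It
    assumes only the published layout of the Tags block (U+E0000..U+E007F,
    with U+E0001 LANGUAGE TAG, U+E0020..U+E007E mirroring ASCII, and
    U+E007F CANCEL TAG). The function names are hypothetical and chosen
    for this example only.

```python
# Sketch only: all function names below are hypothetical, not a library API.
TAG_BLOCK_START = 0xE0000   # first code point of the Plane 14 Tags block
TAG_BLOCK_END = 0xE007F     # last code point of the block
LANGUAGE_TAG = 0xE0001      # U+E0001 LANGUAGE TAG, introduces a language tag
CANCEL_TAG = 0xE007F        # U+E007F CANCEL TAG
TAG_ASCII_FIRST = 0xE0020   # tag characters mirror ASCII 0x20..0x7E
TAG_ASCII_LAST = 0xE007E


def encode_language_tag(lang: str) -> str:
    """Build a Plane 14 language tag for an ASCII language code such as 'fr'."""
    return chr(LANGUAGE_TAG) + "".join(chr(0xE0000 + ord(c)) for c in lang)


def strip_plane14_tags(text: str) -> str:
    """Ignore tags: drop every code point in the Tags block (one range check)."""
    return "".join(
        ch for ch in text
        if not (TAG_BLOCK_START <= ord(ch) <= TAG_BLOCK_END)
    )


def split_language_runs(text: str):
    """Interpret tags: return (language, run) pairs; language is None if untagged."""
    runs = []
    lang = None
    buf = []

    def flush():
        # Emit the text collected so far under the language currently in effect.
        if buf:
            runs.append((lang, "".join(buf)))
            buf.clear()

    i, n = 0, len(text)
    while i < n:
        cp = ord(text[i])
        if cp == LANGUAGE_TAG:
            flush()
            i += 1
            tag_chars = []
            # Collect the tag characters that follow and map them back to ASCII.
            while i < n and TAG_ASCII_FIRST <= ord(text[i]) <= TAG_ASCII_LAST:
                tag_chars.append(chr(ord(text[i]) - 0xE0000))
                i += 1
            lang = "".join(tag_chars) or None
        elif cp == CANCEL_TAG:
            flush()
            lang = None
            i += 1
        elif TAG_BLOCK_START <= cp <= TAG_BLOCK_END:
            i += 1  # stray tag character outside a tag: skip it
        else:
            buf.append(text[i])
            i += 1
    flush()
    return runs


sample = encode_language_tag("fr") + "bonjour" + chr(CANCEL_TAG) + " hello"
print(strip_plane14_tags(sample))
print(split_language_runs(sample))
```

    In this sketch, ignoring the tags is a single range comparison per
    character, and interpreting them is a small state machine with no
    escaping or nesting to track.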

    -Doug Ewell
     Fullerton, California


