RE: The result of the plane 14 tag characters review.

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Wed Nov 13 2002 - 12:00:25 EST

    Doug Ewell wrote:
    > 1. What extra processing is necessary to interpret Plane 14 tags that
    > wouldn't be necessary to interpret any other form of tags?

    In order for the question to make sense, we should compare plain text with
    plain text and rich text with rich text.

    1.a) Take plain text: however lightweight it may be to process (or strip)
    Plane 14 tags, it is still heavier than "zero", which is the amount of
    "processing" that Plane 14 tags would need if they did not exist, or
    that they need if they are simply ignored.
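
    To make the "lightweight but not zero" cost in (1.a) concrete, here is a
    minimal Python sketch of stripping. It assumes the tag characters are
    exactly the Tags block U+E0000..U+E007F; the function names are invented
    for this illustration and do not come from any library.

```python
# Assumption: Plane 14 tag characters are the Tags block U+E0000..U+E007F.
TAG_FIRST = 0xE0000
TAG_LAST = 0xE007F


def is_plane14_tag(ch: str) -> bool:
    """Return True if ch is a code point in the Plane 14 Tags block."""
    return TAG_FIRST <= ord(ch) <= TAG_LAST


def strip_plane14_tags(text: str) -> str:
    """Drop every Plane 14 tag character; one range check per code point."""
    return "".join(ch for ch in text if not is_plane14_tag(ch))


# U+E0001 LANGUAGE TAG followed by the tag letters "e" and "n", then text.
tagged = "\U000E0001\U000E0065\U000E006Ehello"
print(strip_plane14_tags(tagged))
```

    The work is one comparison per code point: small, but still more than
    the nothing that untagged plain text requires.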

    1.b) Take rich text: the plain-text processing cost is the sum of the
    costs of processing each plain-text fragment that results from
    interpreting the rich-text protocol. Any additional cost is irrelevant
    to this comparison, because it depends only on the complexity of the higher
    protocol, and because it occurs *before* the plain-text fragments are
    available for processing. E.g., the extra processing needed to parse XML
    syntax (including XML language tagging) is not to be counted as plain-text
    processing.
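
    The separation described in (1.b) can be sketched as follows, using
    Python's standard xml.etree.ElementTree. The helper name
    plain_text_fragments and the sample document are assumptions made for
    this illustration. The XML layer resolves xml:lang first, and only then
    hands over plain-text fragments that contain no tagging at all.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def plain_text_fragments(xml_source: str):
    """Return (language, text) pairs; language is inherited from ancestors."""
    fragments = []

    def walk(elem, inherited_lang):
        lang = elem.get(XML_LANG, inherited_lang)
        if elem.text:
            fragments.append((lang, elem.text))
        for child in elem:
            walk(child, lang)
            # Text after a child element belongs to the parent's language.
            if child.tail:
                fragments.append((lang, child.tail))

    walk(ET.fromstring(xml_source), None)
    return fragments


doc = '<p xml:lang="en">Hello <q xml:lang="it">ciao</q> world</p>'
for lang, text in plain_text_fragments(doc):
    print(lang, repr(text))
```

    Everything inside plain_text_fragments is a cost of the higher protocol;
    the plain-text cost begins only with the fragments it returns.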

    > 2. What extra processing is necessary to ignore Plane 14 tags that
    > wouldn't be necessary to ignore any other Unicode character(s)?

    No extra processing would be necessary to ignore Plane 14 tags that wouldn't
    be necessary to ignore any other Unicode characters. But I fail to see the
    point of this question.

    > 3. Is there any method of tagging, anywhere, that is lighter-weight
    > than Plane 14? (Corollary: Is "lightweight" important?)

    A lighter-weight method is not having language tagging at all in plain text.
    This is appropriate in two cases:

    3.a) When you don't need language tagging.

    3.b) When language tagging can be provided by a higher-level protocol.

    My assumption is that plain text always falls in case (3.a), and rich text
    always falls in case (3.b). So far, I haven't seen any proof that this
    assumption is incorrect.

    _ Marco


