Re: Arabic letters separated by markup

From: Doug Ewell (
Date: Fri Jun 10 2005 - 00:41:30 CDT


    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > Unicode sees markup in a HTML file as if it was splitting the rich
    > document into many distinct plain-text documents. What these extra
    > markup will do is also not specified.
    > So if you insert markup in the middle of a combining sequence, it is
    > no longer a single combining sequence for Unicode. Instead it will be
    > seen by Unicode as a document ending with a correct combining
    > sequence, and another document starting by a defective combining
    > sequence.

    I don't believe "Unicode sees" any of this at all. Unicode is a
    character encoding standard for plain text. If one wraps plain text in
    markup, or (as in this case) weaves the two together, it is up to the
    higher-level protocol -- the markup -- to determine how the two
    interact.

    If we are really talking about how certain rendering engines from
    certain vendors display certain sequences, then that might be a markup
    issue, or it might be a vendor or implementation issue. But I don't see
    it as something over which Unicode, the character encoding standard, has
    any control.
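
    To make the two readings concrete, here is a minimal Python sketch. It
    assumes a naive regular expression is good enough to find tags, and it
    treats "defective" as "begins with a character of General_Category M*".
    The helper names (text_runs, plain_text, starts_defective) and the sample
    string are hypothetical, chosen only for illustration.

```python
import re
import unicodedata

# Naive tag matcher, for illustration only; a real HTML parser is more involved.
TAG = re.compile(r"<[^>]*>")

def starts_defective(text):
    """Return True if text begins with a combining mark (category Mn, Mc or Me)."""
    return bool(text) and unicodedata.category(text[0]).startswith("M")

def text_runs(marked_up):
    """Split at tags and return the non-empty text runs between them."""
    return [run for run in TAG.split(marked_up) if run]

def plain_text(marked_up):
    """Remove the tags and return the remaining text as one string."""
    return TAG.sub("", marked_up)

# ARABIC LETTER BEH, then ARABIC FATHA wrapped in markup, then ARABIC LETTER ALEF.
sample = "\u0628<span class=\"vowel\">\u064E</span>\u0627"

# Reading 1: every run between tags is examined as its own piece of plain text.
runs = text_runs(sample)
per_run = [starts_defective(run) for run in runs]

# Reading 2: the markup layer hands over the text with the tags removed.
whole = plain_text(sample)
whole_defective = starts_defective(whole)
```

    Under the first reading the run holding only the fatha starts with a
    combining mark; under the second the fatha still follows its base
    letter. Which reading applies is decided by whatever processes the
    markup.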

    These are my opinions only.

    Doug Ewell
    Fullerton, California
