RE: Arabic letters separated by markup

From: Jony Rosenne
Date: Thu Jun 16 2005 - 06:44:11 CDT

    > -----Original Message-----
    > From: [] On Behalf Of James Kass
    > Sent: Wednesday, June 15, 2005 9:54 PM
    > To: Unicode Discussion
    > Subject: Re: Arabic letters separated by markup
    > Gregg Reynolds wrote,
    > > ... It's misleading and confusing to say that multicolored text
    > > is not plaintext when in fact we have no way of inferring the
    > > form of the original coded message based solely on its
    > > representation.
    > From the glossary at
    > ( )
    > "Plain Text. Computer-encoded text that consists only
    > of a sequence of code points from a given standard,
    > with no other formatting or structural information.
    > Plain text interchange is commonly used between
    > computer systems that do not share higher-level
    > protocols."
    > Colour is not currently considered an aspect of plain text, so any
    > colouring information would be mark-up/rich text.
    > See also the glossary entry for rich text, which includes this: "The
    > Unicode Standard does not address the representation of rich text."
    > We seem to be off-topic.

    Not necessarily - we are discussing the plain text aspects of rich text.
    Rich text is plain text with "decoration".

    > Inserting mark-up tags between characters which would normally
    > ligate or shape or re-position breaks the run of text.

    I think that the higher-level protocol, such as HTML, CSS or XML, should
    define that. My reading of HTML and CSS is that inline markup does not break
    the run. The display engine should extract the plain text of the run, apply
    the relevant Unicode algorithms such as bidi, mirroring and Arabic shaping,
    and then apply the rich text decoration to the result as well as possible.
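
    To make the order of operations concrete, here is a rough Python sketch
    of "extract the plain text, shape it, then decorate". It is only an
    illustration under stated assumptions, not how any particular browser
    works: the Span class, the colour attribute and the tiny joining-type
    table are all hypothetical, and bidi reordering, mirroring, ligatures
    and transparent characters are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """One inline-markup span: some text plus a single decoration (hypothetical)."""
    text: str
    color: str


# Tiny, incomplete joining-type table, for illustration only.
# "D" = dual-joining, "R" = right-joining; everything else is treated as
# non-joining. A real engine would take this from the Unicode Character Database.
JOINING_TYPE = {
    "\u0628": "D",  # ARABIC LETTER BEH
    "\u0645": "D",  # ARABIC LETTER MEEM
    "\u0627": "R",  # ARABIC LETTER ALEF
    "\u062F": "R",  # ARABIC LETTER DAL
}


def flatten(spans):
    """Return the run's plain text and a parallel list of per-character colours."""
    text = "".join(span.text for span in spans)
    colors = [span.color for span in spans for _ in span.text]
    return text, colors


def joining_forms(text):
    """Choose a contextual form for each character from its logical neighbours."""
    forms = []
    for i, ch in enumerate(text):
        this_type = JOINING_TYPE.get(ch)
        prev_type = JOINING_TYPE.get(text[i - 1]) if i > 0 else None
        next_type = JOINING_TYPE.get(text[i + 1]) if i + 1 < len(text) else None
        # A letter connects backwards only if the previous letter is dual-joining.
        joins_prev = this_type in ("D", "R") and prev_type == "D"
        # Only a dual-joining letter can connect forwards.
        joins_next = this_type == "D" and next_type in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


def shape_then_decorate(spans):
    """Shape the whole run as plain text, then reattach each character's colour."""
    text, colors = flatten(spans)
    return list(zip(text, joining_forms(text), colors))


# BEH in red, followed by MEEM BEH in black: markup sits between joining letters.
spans = [Span("\u0628", "red"), Span("\u0645\u0628", "black")]
for ch, form, color in shape_then_decorate(spans):
    print(f"U+{ord(ch):04X} {form} {color}")
```

    Because the forms are computed over the whole run, the red BEH comes out
    as an initial form. Shaping each span separately, as in
    joining_forms("\u0628"), would give an isolated form instead, which is
    the broken-run behaviour described in the quoted text.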


    > This isn't
    > a problem; it's expected behaviour. Although it might be possible
    > to define some method of colouring parts of a glyph in HTML/XML,
    > by the time the author has mastered the syntax and all of the
    > world's browsers support the syntax -- well, it would have been
    > simpler to just use graphics.
    > Best regards,
    > James Kass
