Re: Markup for Language (was: Re: Exemplifying apostrophes)

From: Doug Ewell
Date: Tue May 27 2008 - 19:35:19 CDT

    Behnam <behnam dot rassi at gmail dot com> wrote:

    > But to clarify what I meant, if for instance I wanted to write the
    > above quoted paragraph in Hebrew, I could select the directionality
    > of the paragraph first, then type *all I'm asking is to identify this
    > as 'Hebrew'* -in Hebrew of-course.
    > It is already identified as 'rtl'. I didn't write an html markup to
    > do this. I should be able to identify it as Hebrew as well.
    > Now if I start a text with an English paragraph introducing a Kurdish
    > text, which would follow in the next paragraph, the first paragraph
    > would be defined as English and the second as rtl and Kurdish.

    Are you sure this doesn't already work for mixed English and Kurdish --
    at least in clean, straightforward cases like the one you describe --
    without the need for RLE and other directional overrides, or other
    special control characters or markup? And if it doesn't work, are you
    sure it's not because of some shortcoming in your editor or word
    processor or browser?

    Each Unicode character has a number of properties, one of which is
    directionality. A character may be strongly LTR or RTL, weakly LTR or
    RTL, neutral, and so forth. There is an entire Unicode Standard Annex
    (UAX #9, "Unicode Bidirectional Algorithm") written to describe the way
    this works.
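To make the per-character property concrete, here is a minimal sketch that looks up the bidirectional class of a few characters using Python's standard `unicodedata` module. The sample characters and the helper name `bidi_class` are my own choices for illustration, not anything from the thread, and the values reported depend on the Unicode version bundled with your Python.

```python
import unicodedata

# Illustrative sample (an assumption, not from the thread): a Latin letter,
# a Hebrew letter, an Arabic letter, a digit, a space, and a comma.
samples = ["A", "\u05D0", "\u0643", "1", " ", ","]


def bidi_class(ch: str) -> str:
    """Return the bidirectional class abbreviation recorded for one character."""
    return unicodedata.bidirectional(ch)


for ch in samples:
    # Print the code point, the character name, and its bidirectional class.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {bidi_class(ch)}")
```

The letters come back with strong classes (`L`, `R`, `AL`), the digit and comma with weak ones (`EN`, `CS`), and the space with a neutral one (`WS`).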

    There are edge cases where you need the directional overrides, and more
    importantly there are display engines that don't understand Unicode
    bidirectionality, but in most cases it should be possible to write
    ordinary visible characters and let the display engine format them.
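As a rough illustration of why the English-then-Kurdish case can work without any markup, the sketch below guesses a paragraph's base direction from its first strong character. It is a deliberately simplified take on rules P2 and P3 of UAX #9: it ignores directional isolates, embeddings, and overrides, and the function name `paragraph_direction` and the sample strings are hypothetical. A real display engine does considerably more than this.

```python
import unicodedata


def paragraph_direction(text: str) -> str:
    """Guess the base direction from the first strong character.

    Simplified assumption: isolates, embeddings, and overrides are ignored.
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    # No strong character found: fall back to left-to-right.
    return "ltr"


# Hypothetical two-paragraph text: an English introduction, then a word in
# Arabic-script Kurdish, each paragraph resolved independently.
paragraphs = ["Here is a Kurdish text:", "\u06A9\u0648\u0631\u062F\u06CC"]
for p in paragraphs:
    print(paragraph_direction(p))
```

Each paragraph gets its direction from its own characters, which is the behaviour a conforming display engine would be expected to provide in clean cases like this one.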

    Doug Ewell  *  Arvada, Colorado, USA  *  RFC 4645  *  UTN #14
