Re: Directionality Standard

From: Behnam (behnam.rassi@gmail.com)
Date: Tue Dec 18 2007 - 18:56:07 CST


    I can think of a few good things about language tags, but defining
    directionality is not one of them. And I am thinking in general
    terms, not only about excerpts of an RSS feed. A language tag can
    activate the localized features of a font for text rendering. That's
    my favourite thing. It can also enable many other customized
    features which, depending on the device and medium, can be vast.
    But the directionality of a paragraph's text is defined by the
    writer and nobody else. A language tag can, by the way, set the
    default directionality to the one most commonly used in that
    language. This saves the writer a lot of clicking! But the bottom
    line is that, no matter what language the writer is using, he or she
    should be able to define the directionality of the paragraph he or
    she is writing. And this directionality must be part of the encoding
    of the text itself and go with it wherever it goes.
    So your argument about the Azerbaijani language, although valid, is
    not the angle from which I am looking at this issue. Can't a Hebrew
    site carry a news item in Hebrew with a long quotation from an
    American politician's speech, in English, in an ltr paragraph? How
    does that fit with a directionality definition based on language? Or
    vice versa. Or maybe, since we haven't seen a long Hebrew paragraph
    in an English news item lately, it shouldn't be taken into
    consideration?!
    Whether it is HTML, RSS, plain text or rich text, the directionality
    of a paragraph is part and parcel of its encoding, and if that means
    that a default ltr paragraph should also carry its default
    identification, so be it.
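
    As a rough illustration of direction travelling with the text, here
    is a minimal sketch that assumes plain text and uses the invisible
    Unicode marks U+200E (LEFT-TO-RIGHT MARK) and U+200F (RIGHT-TO-LEFT
    MARK) as a prefix. The function names with_direction and
    declared_direction are hypothetical, and nothing here is meant to
    describe what any existing feed reader or format actually does.

```python
import unicodedata

LRM = "\u200E"  # LEFT-TO-RIGHT MARK, bidi class L
RLM = "\u200F"  # RIGHT-TO-LEFT MARK, bidi class R


def with_direction(paragraph, direction):
    """Prefix a paragraph with an invisible mark that records its direction."""
    mark = {"ltr": LRM, "rtl": RLM}[direction]
    return mark + paragraph


def declared_direction(paragraph):
    """Read back the direction recorded by with_direction, or None if absent."""
    if paragraph.startswith(LRM):
        return "ltr"
    if paragraph.startswith(RLM):
        return "rtl"
    return None


# An English quotation on an otherwise Hebrew site keeps its own direction.
quote = with_direction("A long English quotation on a Hebrew site.", "ltr")
print(declared_direction(quote))        # ltr
print(unicodedata.bidirectional(RLM))   # R
```

    Because the marks are strongly directional characters, a renderer
    that picks the paragraph direction from the first strong character
    would also see them, whatever the language tag says.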

    Behnam

    On 18-Dec-07, at 7:06 AM, Richard Ishida wrote:

    > I posted the following comment at
    > http://blogs.msdn.com/rssteam/archive/2007/05/17/reading-feeds-in-right-to-left-order.aspx
    >
    > =======================================
    > There are some problems with this approach, as I see it.
    >
    > [1] What will you do about Azerbaijani, etc.? Azerbaijani language is
    > written in both RTL (Arabic) and LTR (Cyrillic) scripts, depending on
    > where you live. Basing the choice on language (az in each case)
    > doesn't scale for that type of situation.
    >
    > [2] How do you cater for feeds where entries are in one language or
    > another, or a mixture of both? This is quite common. Rather than
    > declaring the directionality at the feed level, you need to declare
    > it at the item level and/or within an item with mixed direction text.
    >
    > [3] The language subtag should not be required to be case-sensitive,
    > since this is not required in the original formats (see BCP 47).
    >
    > I think you are limited here by the lack of proper bidi handling in
    > the RSS and Atom formats, so looking at language is a workaround that
    > addresses many cases reasonably well. I believe you should express it
    > this way in your article above - not imply that this is a perfectly
    > fine solution.
    >
    > I hope that helps.
    > ========================================
    >
    > But it looks like someone needs to talk with the RSS and Atom format
    > people and explain the issue. (and recommend that they read
    > http://www.w3.org/TR/its/).
    >
    > RI
    >
    > ============
    > Richard Ishida
    > Internationalization Lead
    > W3C (World Wide Web Consortium)
    >
    > http://www.w3.org/International/
    > http://rishida.net/blog/
    > http://rishida.net/
    >
    >
    >
    >
    >> -----Original Message-----
    >> From: unicode-bounce@unicode.org
    >> [mailto:unicode-bounce@unicode.org] On Behalf Of 'Stephane
    >> Bortzmeyer'
    >> Sent: 18 December 2007 10:27
    >> To: Behnam
    >> Cc: unicode@unicode.org
    >> Subject: Re: Directionality Standard
    >>
    >> On Tue, Dec 18, 2007 at 04:42:11AM -0500, Behnam
    >> <behnam.rassi@gmail.com> wrote a message of 67 lines which said:
    >>
    >>> But RSS readers were displaying the content in ltr format
    >>
    >> If the RSS feed (you did not provide the URL, by the way)
    >> uses the Arabic script, then, indeed, these readers are wrong
    >> since Arabic characters in Unicode have the bidi class
    >> "Arabic_Letter" and should be rendered RTL, without needing
    >> HTML or Atom attributes (I do not think RSS has such an attribute.)
    >>
    >>> http://blogs.msdn.com/rssteam/archive/2007/05/17/reading-feeds-in-right-to-left-order.aspx
    >>
    >> IMHO, this page is wrong because it uses languages instead of
    >> scripts. There are also technical mistakes; for instance, they
    >> say "The language value must be in lowercase" while language
    >> tags are case-insensitive.
    >>
    >>> But I am puzzled why a directionality issue should be resolved by a
    >>> language tag in the first place,
    >>
    >> That's probably because *most* languages are written in only
    >> one script (Azeri, for instance, is an exception) and
    >> therefore have only one directionality.
    >>
    >> But, you're right and they're wrong, directionality is a
    >> property of the script, not of the language.
    >>
    >>
    >
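
    The point made in the quoted messages, that direction follows the
    script rather than the language, can be sketched with the bidi class
    that Unicode assigns to each character. The following is a minimal
    sketch, assuming a simplified "first strong character" rule: it
    ignores embeddings and other details of the full bidirectional
    algorithm, and the function name base_direction is hypothetical.

```python
import unicodedata


def base_direction(text, default="ltr"):
    """Guess a paragraph's direction from its first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):   # right-to-left letters, Arabic letters
            return "rtl"
        if bidi_class == "L":           # left-to-right letters
            return "ltr"
    return default                      # no strong character found


# The same language (az) comes out differently depending on the script.
print(base_direction("آذربایجان"))    # Arabic script: rtl
print(base_direction("Азәрбајҹан"))   # Cyrillic script: ltr
```

    Such a guess still fails for a Hebrew item that opens with an
    English quotation, which is why an explicit declaration per item or
    per paragraph is argued for above.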


