Re: Comments on BiDi in HTML/i18n draft

From: Martin J Duerst (mduerst@ifi.unizh.ch)
Date: Thu May 23 1996 - 11:39:24 EDT


>
>I have been following the progress of the HTML working group for
>some time now, especially with regards to the issues concerning
>the i18n draft by F.Yergeau, G.Nicol, G.Adams, and M.Duerst. I
>have also had the opportunity to chat with both Glen (Adams) and
>Martin (Duerst) in person regarding various aspects of the
>draft.
>
>At this point, as someone involved in the implementation of a
>Unicode Web Browser and multilingual HTML authoring tool which
>supports BiDi (Hebrew and Arabic) and many other scripts
>(www.accentsoft.com), I would like to raise some points
>regarding the i18n draft. My comments are broken up by subject.

Hello Rob - Nice to hear from you again. Your comments come
rather late, but it looks like deadlines are necessary for some
things. Anyway, I appreciate your clear list of issues and
proposals, although I don't agree on all of them :-).

>I. Language Marking
>-------------------

See my comments in the follow-up messages.

> 4. The separator between the language and "ethnologue" is a
> period in the HTML 3 draft, while it is a "dash" in the
> i18n draft. Which one is it? Both?

The authors of the HTML 3 draft were not aware of
RFC 1766, which clearly specifies "-" as the separator.
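
To illustrate the RFC 1766 form (a primary tag followed by
"-"-separated subtags, each 1 to 8 ASCII letters), here is a
minimal sketch. The function name parse_language_tag is
hypothetical and not part of any draft; lowercasing the result is
my own choice, based on the tags being case-insensitive.

```python
import re

# RFC 1766: Language-Tag = Primary-tag *( "-" Subtag ),
# where each part is 1 to 8 ASCII letters.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def parse_language_tag(tag):
    """Return (primary, subtags) for a well-formed tag, else None.

    Hypothetical helper for illustration only.
    """
    if not _TAG_RE.fullmatch(tag):
        return None
    # Tags are case-insensitive, so normalise to lowercase (an assumption
    # made here for easy comparison, not a requirement).
    primary, *subtags = tag.lower().split("-")
    return primary, subtags


print(parse_language_tag("en-US"))  # dash separator: accepted
print(parse_language_tag("en.US"))  # period separator: rejected
```

A tag written with a period, as in the HTML 3 draft, simply fails
to match in this sketch.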

>II. Direction of Text
>--------------------
>Here we are in basic agreement with the i18n draft regarding
>the use of the DIR attribute. We understand its use as follows:
> 1. When used in the <HTML> tag, the value specifies the
> default direction (also called reading order) of the
> document.
>
> 2. The direction of a block of text can be specified
> explicitly by using DIR as an attribute of a block
> container tag such as <P>, etc.
>
> 3. The direction of individual characters can be set by
> using DIR inside the proposed <SPAN> tag.

Insofar as you take into account that DIR is not limited
to the tags you mention, but can appear on the same tags
as LANG, for similar reasons, the above
summary is quite appropriate.
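
The summary above amounts to "the nearest enclosing DIR wins".
A rough sketch of that lookup follows, assuming a toy Element
class (hypothetical, not a real parser API) and assuming "ltr"
as the fallback when no element specifies DIR at all:

```python
class Element:
    """Toy element node; a stand-in for whatever tree a browser builds."""

    def __init__(self, tag, dir=None, parent=None):
        self.tag = tag
        self.dir = dir          # value of the DIR attribute, or None
        self.parent = parent    # enclosing element, or None at the root


def effective_dir(element, default="ltr"):
    """Walk up from the element to the nearest ancestor carrying DIR."""
    node = element
    while node is not None:
        if node.dir is not None:
            return node.dir.lower()
        node = node.parent
    # Assumption: left-to-right when nothing is specified anywhere.
    return default


html = Element("HTML", dir="rtl")
para = Element("P", parent=html)                  # inherits from <HTML>
span = Element("SPAN", dir="ltr", parent=para)    # explicit override
print(effective_dir(para), effective_dir(span))
```

Nothing in the loop depends on the tag name, which matches the
point that DIR is not tied to a fixed set of tags.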

>All of this is good stuff, however we have the following items
>to add:
>
> 1. The <TABLE> tag can also accept DIR. The first cell of
> a right-to-left table (used in Hebrew, Arabic) would be
> in its upper right hand corner. The use of DIR here is
> a must.

We should probably extend to the DIR attribute, in an
appropriate way, the language we use for the LANG attribute,
namely that it should be added to all new elements that can
have something to do with text.
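
For the table case in the quoted point, the effect of DIR on cell
placement can be sketched as a mapping from logical to visual
column. The function name visual_column is hypothetical, and the
sketch ignores cells that span several columns:

```python
def visual_column(logical_index, ncols, direction="ltr"):
    """Map a cell's logical column to its on-screen column.

    Column 0 is the leftmost visual column. In a right-to-left table
    the first logical cell lands in the rightmost column.
    """
    if not 0 <= logical_index < ncols:
        raise ValueError("column index out of range")
    if direction == "rtl":
        return ncols - 1 - logical_index
    return logical_index


# First cell of a three-column table, in both directions.
print(visual_column(0, 3, "ltr"), visual_column(0, 3, "rtl"))
```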

> 3. By default, DIR="rtl" text blocks should be aligned
> right.

This is correct, but it is a browser/user interface issue.
A browser may also choose to align all paragraphs
"justified" in the absence of any other specification.

>III. BiDi Issues
>----------------
> 1. No BiDi layout should be performed on text marked with
> the <PRE> tag.

Can you give the reason for this? Jonathan gave some rather
convincing reasons for specifying it the other way round,
but I guess you have your arguments, too.

>IV. Character Set Identification
>--------------------------------
>Here we agree on all the methods for identifying the character
>set of an HTML document, however we feel the order of preference
>for obtaining this information should be:

> 4. From the byte ordering mark in UCS-2 encoded files.
> 5. Any other heuristic for identifying character set.

4. is a very good heuristic. But I guess we don't need to
mention heuristics at all in the standard.
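
For reference, the byte order mark check is small: U+FEFF
serialised in 16 bits is FE FF in big-endian order and FF FE in
little-endian order. A minimal sketch, with the hypothetical name
detect_ucs2_byte_order:

```python
def detect_ucs2_byte_order(data):
    """Guess the byte order of 16-bit encoded data from a leading BOM.

    Returns "big-endian", "little-endian", or None when no BOM is found.
    This is only a heuristic: data in some other encoding could happen
    to start with the same two bytes, and a missing BOM proves nothing.
    """
    prefix = data[:2]
    if prefix == b"\xfe\xff":
        return "big-endian"
    if prefix == b"\xff\xfe":
        return "little-endian"
    return None


print(detect_ucs2_byte_order(b"\xfe\xff\x00H\x00i"))
print(detect_ucs2_byte_order(b"<HTML>"))
```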

Many thanks again for the constructive contribution.

Regards, Martin.

----
Dr.sc.  Martin J. Du"rst			    ' , . p y f g c R l / =
Institut fu"r Informatik			     a o e U i D h T n S -
der Universita"t Zu"rich			      ; q j k x b m w v z
Winterthurerstrasse  190			     (the Dvorak keyboard)
CH-8057   Zu"rich-Irchel   Tel: +41 1 257 43 16
 S w i t z e r l a n d	   Fax: +41 1 363 00 35   Email: mduerst@ifi.unizh.ch
テュールスト・マーティン・ヤコブ（チューリッヒ大学情報科学科）
----


