Questionable statement in HTML 4.0 spec

From: David Goldsmith
Date: Mon Sep 22 1997 - 16:12:14 EDT

Hi all,

I was looking through the HTML 4.0 spec, at URL:

and came across the following passage, in section 9.2.4:

>Consider an English document containing the same text as before:
>english1 HEBREW2 english3 HEBREW4 english5 HEBREW6
>Suppose this sequence of characters is being read by a user agent from
>left-to-right (the byte stream begins with "e" and ends with "6"). The
>"e" in "english1" is to the left of "n", which is how authors tend to
>input English characters. However, the "H" in "HEBREW2" is to the left
>of "E", which may not be how authors of Hebrew create their documents.
>For example, the MIME standard ([RFC2045]) requires right-to-left
>character sequences in email to be ordered right-to-left in the byte
>stream. This conflicts with the [UNICODE] bidirectional algorithm, which
>expects Hebrew characters to be ordered left-to-right.

Now, maybe I'm missing something, but I can find no place in RFC 2045
where characters are required to be in visual order. I'm certainly not
questioning the need for bidirectional override, but the justification
given here, that RFC 2045 "requires" right-to-left ordering, seems just
plain wrong. As far as I can tell, you would need an override only for
the visual-order variants of ISO-8859, and that case can be handled at
the time of converting to Unicode.
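
To illustrate what "handled at the time of converting to Unicode" could
look like, here is a minimal sketch in Python. It is not a real inverse
bidirectional algorithm. It assumes the input is a single left-to-right
line in visually ordered ISO-8859-8 containing only ASCII text and
Hebrew letters (no digits, punctuation, or nested runs inside the
Hebrew). The function name visual_to_logical is hypothetical.

```python
import re

# A maximal run of Hebrew letters (U+05D0..U+05EA), allowing single
# spaces between Hebrew words so that word order is reversed as well.
_HEBREW_RUN = re.compile(r"[\u05D0-\u05EA]+(?: [\u05D0-\u05EA]+)*")


def visual_to_logical(raw: bytes) -> str:
    """Decode visually ordered ISO-8859-8 bytes into logical-order Unicode.

    Simplifying assumption: in a left-to-right line, each Hebrew run was
    stored in display order, so reversing the run restores reading order.
    """
    text = raw.decode("iso8859_8")
    return _HEBREW_RUN.sub(lambda m: m.group(0)[::-1], text)


# The bytes E2 E1 E0 are gimel, bet, alef as displayed left to right;
# the logical (reading) order is alef, bet, gimel.
print(visual_to_logical(b"abc \xe2\xe1\xe0 def"))
```

After such a conversion the text is in logical order, so the Unicode
bidirectional algorithm can lay it out without an explicit override.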

Am I missing something?

David Goldsmith
International, Text, and Graphics Department
Apple Computer, Inc.

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:37 EDT