Date: Thu Aug 01 2002 - 01:01:25 EDT

>From: Lars Marius Garshol <>

>This reminds me: does anyone have any pointers to information on how
>to convert visually encoded text (especially HTML, but also other
>formats) to Unicode?

There are programs that do it on the fly for Hebrew. The best one, which I
have used myself, is HebTML, available as a free download. The author has
been working with me on testing a new version that supports Unicode. However,
I use this app much less than before, because the Hebrew Internet is rapidly
making the transition from visual to logical ordering. With IE 5.x and
Mozilla supporting logical Hebrew, the years-old visual ordering is on the
way out.

The conversion of visual to logical text in BiDi scripts is straightforward
in the basic case: check the BiDi property of each character, and reverse the
runs that are RTL. That means Hebrew letters reverse their order, while
digits and Latin letters stay the same. Things get more complicated, however,
when hyphens, paired punctuation and telephone numbers appear. You need a
smart converter for that.
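
To make the basic case concrete, here is a minimal Python sketch of my own (it
is not how HebTML works, and the function name visual_to_logical is just a
hypothetical choice). It assumes a single visually ordered line with an RTL
base direction: it reverses the whole line, then restores every stretch of
Latin letters and digits to its original order. It relies only on the
standard unicodedata.bidirectional() lookup. It deliberately ignores the hard
cases mentioned above, so brackets are not mirrored and numbers separated by
spaces may come out in the wrong order.

```python
import unicodedata

LTR_CLASSES = {"L", "EN", "AN"}  # Latin letters and digits keep their order
RTL_CLASSES = {"R", "AL"}        # Hebrew (R) and Arabic (AL) letters


def visual_to_logical(line: str) -> str:
    """Naively convert one visually ordered RTL line to logical order."""
    # Step 1: reverse everything, which puts the RTL letters in logical order.
    chars = list(reversed(line))
    classes = [unicodedata.bidirectional(c) for c in chars]

    # Step 2: undo the reversal for each LTR span. A span starts at an LTR
    # character and runs to the last LTR character before the next RTL
    # letter, so neutrals between Latin words or digits stay inside it.
    i, n = 0, len(chars)
    while i < n:
        if classes[i] in LTR_CLASSES:
            end = i
            j = i
            while j < n and classes[j] not in RTL_CLASSES:
                if classes[j] in LTR_CLASSES:
                    end = j
                j += 1
            chars[i:end + 1] = chars[i:end + 1][::-1]
            i = end + 1
        else:
            i += 1
    return "".join(chars)


# Usage: escapes are used so the stored character order is unambiguous.
logical = "\u05e9\u05dc\u05d5\u05dd abc 123 \u05d8\u05dc"
visual = "\u05dc\u05d8 abc 123 \u05dd\u05d5\u05dc\u05e9"
assert visual_to_logical(visual) == logical
```

A converter meant for real pages would have to follow the Unicode
Bidirectional Algorithm much more closely than this does.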

In essence, visually ordered Hebrew is a kludge for supporting Hebrew on
platforms that weren't designed for it. In other words, it is an adaptation
of Hebrew text to monodirectional LTR platforms. On modern platforms the onus
of handling directionality passes to the software.


Shlomi Tal שלומי טל

