Re: bidi, white space and word separator

From: Maurice J Bauhahn (
Date: Tue Jul 29 1997 - 14:23:52 EDT

I am preparing a proposal to advocate a SOFT SPACE which can be invisible
until such time as it is needed to fill out a line to full justification
in languages such as Khmer, Thai, and Lao. These all have invisible,
zero-width word breaks, but the phrase break would not be limited to
"the white space": according to context it may also include an additional
white space (SOFT SPACE). These two types of white space would
alternatively separate some words, but not the majority, functioning
individually much like the European language comma-space pair. A SOFT
SPACE would best be inserted to allow expansion of a line to full
justification, but also should relate to some degree to the flow of
meaning. Since it does not do a very good job of the latter, it could be
algorithmically reduced to zero-width according to context or command.
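
A rough sketch of the behaviour described above, in Python, may help make
the idea concrete. It rests on several assumptions that are not part of
the proposal itself: SOFT SPACE has no assigned code point, so a
private-use character (U+E000) stands in for it; U+200B ZERO WIDTH SPACE
marks the invisible word breaks; and every other code point is counted as
one fixed-width cell, which real Khmer, Thai, or Lao text with combining
marks would not satisfy. The function name `justify` is likewise only
illustrative.

```python
ZWSP = "\u200b"        # ZERO WIDTH SPACE: invisible word break (real character)
SOFT_SPACE = "\ue000"  # hypothetical stand-in: SOFT SPACE has no code point


def visible_width(line: str) -> int:
    """Count cells, assuming one cell per code point (a simplification)."""
    return sum(1 for ch in line if ch not in (ZWSP, SOFT_SPACE))


def justify(line: str, width: int) -> str:
    """Expand soft spaces just enough to fill the line out to `width`.

    If the line is already full, or has no soft spaces, every soft space
    is reduced to zero width, i.e. removed from the output.
    """
    slots = line.count(SOFT_SPACE)
    deficit = width - visible_width(line)
    if slots == 0 or deficit <= 0:
        return line.replace(SOFT_SPACE, "")

    # Spread the missing cells over the soft spaces, leftmost first.
    base, extra = divmod(deficit, slots)
    out = []
    seen = 0
    for ch in line:
        if ch == SOFT_SPACE:
            out.append(" " * (base + (1 if seen < extra else 0)))
            seen += 1
        else:
            out.append(ch)
    return "".join(out)


# Latin letters stand in for words so the one-cell-per-character
# assumption holds in this example.
sample = "ab" + ZWSP + "cd" + SOFT_SPACE + "ef" + SOFT_SPACE + "gh"
full = justify(sample, 11)
```

Here the two soft spaces in `sample` absorb the three missing cells
between them, while the zero-width word break between "ab" and "cd" stays
invisible. Calling `justify(sample, 8)` instead removes both soft spaces,
since the line already fills the measure.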

So yes, white space is an important (and complicated) matter in Asian

Maurice Bauhahn

On Tue, 29 Jul 1997, Daniel Glazman wrote:

> The HTML WG of W3C is currently preparing HTML version 4.0. It has a
> section about what a white space is and clearly warns about its
> interpretation in some asiatic languages.
> White space being an important character for bidi and justification of
> text, here is my concern : in the ISO set, is there a language where
> the word separator is not the white space and is not reduced to null ?

"the white space" or "a white space"?

> Some ancient languages and writings used ":" to separate words or even
> "|"... If the answer is uncertain or positive, what is the impact on
> BIDI and general text rendering by a Unicode browser ?
