Re: help on the notion of Zero Width Space

From: John H. Jenkins
Date: Fri Nov 07 1997 - 15:05:15 EST

On 11/7/97 11:15 AM, Kaiying Yang wrote:

>Can anybody tell me what the code point FEFF ZERO WIDTH NO-BREAK SPACE is
>designed for? Can it be used as a possible marker or word delimiter in
>text? Presumably, plain Unicode text should be able to store the
>information of this code point.

The original intent of U+FEFF was to be the "byte-order mark." Its
byte-swapped counterpart, U+FFFE, is explicitly *not* a valid Unicode
character, so if you see text starting with or containing U+FEFF you know
you've got the right byte order, and if you see text starting with or
containing U+FFFE you know you've got the wrong one and need to byte-swap
all the Unicode values you're dealing with.
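
As a rough illustration of the check described above, here is a minimal
Python sketch. It assumes the text is a stream of 16-bit code units and
that the mark, if present, is the first unit. The function names
needs_byte_swap and swap_units are made up for this example and are not
part of any standard API.

```python
import sys

BYTE_ORDER_MARK = 0xFEFF   # valid character: byte order is as expected
SWAPPED_MARK = 0xFFFE      # not a valid character: bytes are reversed


def needs_byte_swap(data: bytes, reader_order: str = sys.byteorder) -> bool:
    """Read the first 16-bit unit in the reader's byte order and decide
    whether the rest of the text has to be byte-swapped."""
    first_unit = int.from_bytes(data[:2], reader_order)
    if first_unit == BYTE_ORDER_MARK:
        return False
    if first_unit == SWAPPED_MARK:
        return True
    raise ValueError("text does not start with a byte-order mark")


def swap_units(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit unit (length must be even)."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    swapped = bytearray(data)
    swapped[0::2] = data[1::2]
    swapped[1::2] = data[0::2]
    return bytes(swapped)


# Little-endian text as seen by a little-endian and a big-endian reader.
sample = "\ufeffhi".encode("utf-16-le")
print(needs_byte_swap(sample, "little"))
print(needs_byte_swap(sample, "big"))
```

The sketch only inspects the first unit. Scanning the whole text for
U+FFFE, as mentioned above, would be a straightforward extension.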

As a part of the merger between Unicode and ISO 10646, it was given the
additional meaning (and name) of zero-width no-break space. Its use is
documented in The Book, p. 6-131, where it says:

"... in that it indicates the absence of word boundaries; however, the former
has no width. For example, this character can be inserted after the
fourth character in the text 'base+delta' to indicate that there should be
no line break between the 'e' and the '+'."
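
The 'base+delta' example from the quoted passage can be sketched in
Python as follows. This is only an illustration: the helper name
insert_no_break is invented here, and whether a given line breaker
honors the character depends on the rendering software.

```python
import unicodedata

ZWNBSP = "\ufeff"  # U+FEFF, ZERO WIDTH NO-BREAK SPACE


def insert_no_break(text: str, index: int) -> str:
    """Return text with U+FEFF inserted after the first `index` characters."""
    return text[:index] + ZWNBSP + text[index:]


# Insert after the fourth character, i.e. between the 'e' and the '+'.
marked = insert_no_break("base+delta", 4)

print(unicodedata.name(marked[4]))
print([f"U+{ord(ch):04X}" for ch in marked[3:6]])
```

The character has no width, so the marked string displays the same as
the original while being one code point longer.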

John H. Jenkins
