On 11/7/97 11:15 AM, Kaiying Yang (email@example.com) wrote:
>Can anybody tell me what the code point of FEFF ZERO WIDTH NO-BREAK SPACE is
>designed for ? can it be used as the possible marker or word delimiter in
>text ? Presumably, plain unicode text should be able to store the
>information of this code point.
The original intent of U+FEFF was to be the "byte-order mark." Its
byte-swapped counterpart, U+FFFE, is explicitly *not* a valid Unicode
character, so if you see text starting with or containing U+FEFF you know
you've got the right byte-order, and if you see text starting with or
containing U+FFFE you know you've got the wrong one and need to byte-swap
all the Unicode values you're dealing with.
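
For what it's worth, here is a rough sketch of that check in Python. It assumes 16-bit Unicode text that actually begins with the mark; the function names detect_byte_order and decode_16bit are made up for illustration and don't come from any library.

```python
def detect_byte_order(data: bytes) -> str:
    """Guess the byte order of 16-bit Unicode text from its first two bytes."""
    first = data[:2]
    if first == b"\xfe\xff":
        return "big"      # read high byte first, this is U+FEFF
    if first == b"\xff\xfe":
        return "little"   # read high byte first, this is U+FFFE: swap needed
    return "unknown"      # no mark present; the caller has to decide


def decode_16bit(data: bytes) -> str:
    """Decode 16-bit text, using the leading mark to pick the byte order."""
    order = detect_byte_order(data)
    if order == "unknown":
        raise ValueError("no byte-order mark found")
    codec = "utf-16-be" if order == "big" else "utf-16-le"
    # Skip the two bytes of the mark itself before decoding.
    return data[2:].decode(codec)
```

Both decode_16bit(b"\xfe\xff\x00h\x00i") and decode_16bit(b"\xff\xfeh\x00i\x00") should give back the same two-character string, since the mark tells the decoder which way round the bytes are.
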
As a part of the merger between Unicode and ISO 10646, it was given the
additional meaning (and name) of zero-width no-break space. Its use is
documented in The Book, p. 6-131, where it says:
"As ZERO WIDTH NO-BREAK SPACE, U+FEFF behaves like U+00A0 NO-BREAK SPACE
in that it indicates the absence of word boundaries; however, the former
has no width. For example, this character can be inserted after the
fourth character in the text 'base+delta' to indicate that there should be
no line break between the 'e' and the '+'."
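
The example from the book can be tried out in a few lines of Python. This is only a sketch: insert_no_break and ZWNBSP are names I've invented here, and whether a given line breaker honors the character depends on the rendering software.

```python
ZWNBSP = "\ufeff"  # U+FEFF ZERO WIDTH NO-BREAK SPACE


def insert_no_break(text: str, index: int) -> str:
    """Return text with U+FEFF inserted after the first `index` characters."""
    return text[:index] + ZWNBSP + text[index:]


# After the fourth character of 'base+delta', i.e. between 'e' and '+'.
marked = insert_no_break("base+delta", 4)
```

The result is one character longer than the original, but since U+FEFF has no width the text should look the same when displayed.
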
John H. Jenkins