From: Kenneth Whistler (firstname.lastname@example.org)
Date: Mon Aug 04 2003 - 17:59:03 EDT
Peter Kirk asked:
> >In other words, if what you need is to glue things together,
> >i.e. a zero width no-break space *function*, then use
> >U+2060. If what you need is a BOM for the encoding scheme
> >specifications, then use U+FEFF.
> >What is *discouraged*, but not prohibited, of course, is
> >using U+FEFF for a zero width no-break space *function*,
> >precisely because that interacts so confusingly with
> >the BOM.
> And what if you need a ZWNBS function for something other than gluing
> things together? For example, as a carrier for a string or line initial
> diacritical mark when no spacing is required?
This is not something sanctioned by the standard.
The carrier for a combining mark that is to display in isolation without
a base character is U+0020 SPACE. If you want to also indicate the
absence of a line break opportunity, then the carrier is U+00A0
NO-BREAK SPACE (NBSP).
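
As a minimal sketch of the two carriers just described (Python; the helper
name `isolated_mark` and the choice of U+0301 as the example mark are
assumptions made for illustration, not anything defined by the standard):

```python
import unicodedata

# COMBINING ACUTE ACCENT, picked only as an example combining mark.
ACUTE = "\u0301"

def isolated_mark(mark: str, allow_break: bool = True) -> str:
    """Return a carrier followed by a combining mark (hypothetical helper).

    U+0020 SPACE is the carrier for isolated display; U+00A0 NO-BREAK SPACE
    is the carrier when no line break opportunity is wanted.
    """
    carrier = "\u0020" if allow_break else "\u00A0"
    return carrier + mark

for s in (isolated_mark(ACUTE), isolated_mark(ACUTE, allow_break=False)):
    # Print the character names so the sequence is visible in plain text.
    print([unicodedata.name(c) for c in s])
```

Either way the mark follows its carrier in the text, so no special
mechanism is needed for the isolated case.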
Despite its name, U+FEFF ZWNBS is *NOT* a space character. It is
formally gc=Cf, not gc=Zs. It also does not have the White_Space
property.
So "a ZWNBS function for something other than gluing things together"
is a contradiction in terms of the current definition of the standard.
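
The general category values can be checked against the Unicode Character
Database, for instance with Python's `unicodedata` module. This is only a
sketch: `describe` is a hypothetical helper, and `str.isspace()` is Python's
own whitespace test, used here as a rough stand-in for the White_Space
property rather than a direct lookup of it.

```python
import unicodedata

def describe(cp: int) -> tuple[str, str, bool]:
    """Return (name, general category, Python isspace) for a code point."""
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.category(ch), ch.isspace()

# The two real space carriers, then the two format characters.
for cp in (0x0020, 0x00A0, 0x2060, 0xFEFF):
    name, gc, is_space = describe(cp)
    print(f"U+{cp:04X} {name}: gc={gc} isspace={is_space}")
```

U+0020 and U+00A0 report gc=Zs, while U+2060 and U+FEFF report gc=Cf and
are not treated as whitespace.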
The *meaning* of the "ZWNBS function" is its behavior in the
context of UAX #14, Line Breaking Properties. See the WJ Word joiner
entry (normative) of UAX #14.
> This is one of the
> suggestions for some of the Hebrew problems, but I have had no response
> to my suggestion of using U+2060, which is inappropriately named for the
> function I have in mind.
The function I think you have in mind is not isolated display of
a combining mark, but rather trying to find a mechanism for
getting around the conformance strictures of the standard, to
get a combining mark to apply to a *following* base
character, rather than to a *preceding* base character.
Trying to use U+FEFF *or* U+2060 to do this would be inappropriate.
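
One rough way to see which base a combining mark belongs to is to look at
what normalization does with it. The sketch below uses NFC composition as a
proxy for that relationship, which is an assumption made for illustration;
the example strings and the `nfc` helper are hypothetical.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT, example mark only
WJ = "\u2060"     # WORD JOINER

def nfc(s: str) -> str:
    """Normalize to NFC (hypothetical convenience wrapper)."""
    return unicodedata.normalize("NFC", s)

after_base = "e" + ACUTE            # mark follows its base
before_base = ACUTE + "e"           # mark placed before the intended base
via_wj = "a" + WJ + ACUTE + "e"     # mark placed after a word joiner

# The mark composes with the *preceding* base character.
print(nfc(after_base) == "\u00E9")
# A mark in front of "e" is not attached to it; the string is unchanged.
print(nfc(before_base) == before_base)
# Putting U+2060 in front of the mark does not redirect it to the "e".
print(nfc(via_wj) == via_wj)
```

All three comparisons print True: a format character in front of the mark
does not make it apply to the character that follows.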