Re: BOM as WJ?

From: Peter Kirk
Date: Wed Nov 19 2003 - 19:35:36 EST

    On 19/11/2003 16:26, Philippe Verdy wrote:

    >From: "Philippe Verdy"
    >>So, <NBSP,CC> must not be treated as if it was:
    >> <WJ,SP,WJ,CC>
    >>but really rather as:
    >> <WJ,SP,CC,WJ>
    >>Note here the inversion.
    >The inversion here acts as if WJ was a combining character of combining
    >class 256 (i.e. with a class higher than the combining class of all other
    >"Mn" combining characters) and a canonical reordering was applied to the
    >sequence.
    >Of course this is not a standard normalization form, but using this pseudo
    >combining class may help render the last two coded strings (in my quote
    >above) equivalently in renderers.
    >This works even in the case where there are multiple diacritics (noted CC1
    >and CC2 below):
    > <NBSP,CC1,CC2>
    >is then treated as if it was:
    > <WJ,SP,WJ,CC1,CC2>
    >and the pseudo-normalization then gives either:
    > <WJ,SP,CC1,CC2,WJ>
    >or:
    > <WJ,SP,CC2,CC1,WJ>
    >(depending on the canonical reordering of CC1 and CC2, i.e. of their
    >relative combining class)
    This trick doesn't work if any of the CCs are in combining class zero,
    since canonical reordering never moves a character past one of class zero.
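
For anyone who wants to try this out, here is a rough Python sketch of the pseudo-normalization discussed above. It is only an illustration under stated assumptions, not anything defined by the standard: the pseudo class 256 for WJ comes from the quoted proposal, and the names PSEUDO_CCC_WJ, pseudo_ccc, expand_nbsp and pseudo_reorder are hypothetical helpers invented for this example. The real combining classes come from Python's unicodedata.combining().

```python
import unicodedata

WJ = "\u2060"    # WORD JOINER
SP = "\u0020"    # SPACE
NBSP = "\u00A0"  # NO-BREAK SPACE

# Assumption from the proposal above: WJ is treated as if it had a
# combining class above every real one. This is not a Unicode property.
PSEUDO_CCC_WJ = 256


def pseudo_ccc(ch):
    """Real canonical combining class, except WJ gets the pseudo class."""
    return PSEUDO_CCC_WJ if ch == WJ else unicodedata.combining(ch)


def expand_nbsp(text):
    """Rewrite each NBSP as <WJ,SP,WJ>."""
    return text.replace(NBSP, WJ + SP + WJ)


def pseudo_reorder(text):
    """Canonical-style reordering using pseudo_ccc.

    Each maximal run of characters with a non-zero class is stably
    sorted by class; class-zero characters never move.
    """
    out = []
    i = 0
    while i < len(text):
        if pseudo_ccc(text[i]) == 0:
            out.append(text[i])
            i += 1
            continue
        j = i
        while j < len(text) and pseudo_ccc(text[j]) != 0:
            j += 1
        # sorted() is stable, so marks of equal class keep their order.
        out.extend(sorted(text[i:j], key=pseudo_ccc))
        i = j
    return "".join(out)


# U+0301 has class 230 and U+0323 has class 220, so they swap and the
# trailing WJ ends up after both of them.
moved = pseudo_reorder(expand_nbsp(NBSP + "\u0301\u0323"))

# U+20DD COMBINING ENCLOSING CIRCLE has class 0, so the WJ stays in
# front of it.
stuck = pseudo_reorder(expand_nbsp(NBSP + "\u20DD"))
```

With these inputs, `moved` comes out as <WJ,SP,U+0323,U+0301,WJ>, while `stuck` remains <WJ,SP,WJ,U+20DD>, which is the class-zero case where the trick breaks down.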

    Peter Kirk
