Re: BOM as WJ?

From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Nov 19 2003 - 19:35:36 EST


    On 19/11/2003 16:26, Philippe Verdy wrote:

    >From: "Philippe Verdy" <verdy_p@wanadoo.fr>
    >
    >
    >>So, <NBSP,CC> must not be treated as if it were:
    >> <WJ,SP,WJ,CC>
    >>but really rather as:
    >> <WJ,SP,CC,WJ>
    >>Note here the inversion.
    >>
    >>
    >
    >The inversion here acts as if WJ were a combining character of combining
    >class 256 (i.e. with a class higher than the combining class of all other
    >"Mn" combining characters) and a canonical reordering were applied to the
    >sequence.
    >
    >Of course this is not a standard normalization form, but using this pseudo
    >combining class may help render the last two coded strings (in my quote
    >above) equivalently in renderers.
    >This works even in the case where there are multiple diacritics (noted CC1
    >and CC2 below):
    > <NBSP,CC1,CC2>
    >is then treated as if it was:
    > <WJ,SP,WJ,CC1,CC2>
    >and the pseudo-normalization then gives:
    > <WJ,SP,CC1,CC2,WJ>
    >or:
    > <WJ,SP,CC2,CC1,WJ>
    >(depending on the canonical reordering of CC1 and CC2, i.e. of their
    >relative combining class)
    >
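    This trick doesn't work if any of the CCs are in combining class zero:
    canonical reordering never moves a character across one of class zero,
    so the WJ would stay in front of that CC.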
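
    A minimal Python sketch of the pseudo-normalization discussed above, to
    make both cases concrete. It assumes WJ is U+2060 and gives it the
    hypothetical class 256; the names pseudo_ccc and pseudo_reorder, and the
    sample marks (U+0301, class 230; U+0323, class 220; U+20DD, class 0), are
    illustrative choices, not part of any standard or library.

```python
import unicodedata

WJ = "\u2060"           # WORD JOINER
PSEUDO_WJ_CLASS = 256   # hypothetical class, above every real combining class


def pseudo_ccc(ch):
    """Combining class of ch, with WJ given the hypothetical class 256."""
    return PSEUDO_WJ_CLASS if ch == WJ else unicodedata.combining(ch)


def pseudo_reorder(text):
    """Apply canonical-ordering-style swaps using the pseudo classes.

    An adjacent pair (a, b) is exchanged only when ccc(a) > ccc(b) > 0,
    so nothing ever moves across a character of class zero.
    """
    chars = list(text)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(chars) - 1):
            a, b = pseudo_ccc(chars[i]), pseudo_ccc(chars[i + 1])
            if a > b > 0:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
                swapped = True
    return "".join(chars)


ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, class 230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, class 220
ENCLOSING = "\u20dd"  # COMBINING ENCLOSING CIRCLE, class 0

one_mark = pseudo_reorder(WJ + " " + WJ + ACUTE)
two_marks = pseudo_reorder(WJ + " " + WJ + ACUTE + DOT_BELOW)
class_zero = pseudo_reorder(WJ + " " + WJ + ENCLOSING)
```

    With these inputs, one_mark comes out as <WJ,SP,CC,WJ> and two_marks as
    <WJ,SP,CC2,CC1,WJ> (dot below sorts before acute), while class_zero is
    left unchanged as <WJ,SP,WJ,CC>, which is the failure case.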

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    


    This archive was generated by hypermail 2.1.5 : Wed Nov 19 2003 - 20:22:15 EST