Re: BOM as WJ?

From: Asmus Freytag (
Date: Fri Nov 21 2003 - 04:09:44 EST

    At 05:52 AM 11/20/2003, Philippe Verdy wrote:
    >We need a comprehensive new technical report that lists all the exceptions
    >to the general category system, as these line-breaking or word-breaking or
    >grapheme cluster breaking properties are orthogonal to the basic GC system
    >and to the combining class system.

    No we don't.

    The GC is quite limited. It can at best capture the 'primary' classification
    of a character. For many characters, esp. in category Cf, all it knows is
    that the character has some behavior that could be interesting, but it is
    silent on what that behavior is. The same is largely true for all the P* and
    Z* classes, where for line and word breaking the rules are more fine-grained.

    We have two UAXs that deal in detail with these two subjects. Adding a third
    UAX on top does not solve a thing.

    The expectation that you can derive useful knowledge of text and line boundary
    detection from just GC and CC is misguided. You need additional information.
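
To make the point concrete, here is a minimal sketch using Python's standard
unicodedata module, which exposes GC and CC but not the line-breaking
property. The small ILLUSTRATIVE_LINE_BREAK table and the describe() helper
are assumptions added for illustration: the Line_Break values were copied by
hand from the UCD's LineBreak.txt and should be checked against the Unicode
version you actually target.

```python
import unicodedata

# Hypothetical, hand-copied Line_Break values (illustration only; verify
# against LineBreak.txt for your Unicode version).
ILLUSTRATIVE_LINE_BREAK = {
    "\u00AD": "BA",  # SOFT HYPHEN: break opportunity after
    "\u200B": "ZW",  # ZERO WIDTH SPACE: explicit break opportunity
    "\u2060": "WJ",  # WORD JOINER: prohibits breaks on both sides
    "\uFEFF": "WJ",  # ZERO WIDTH NO-BREAK SPACE (BOM): same as WORD JOINER
}


def describe(ch):
    """Return (code point, GC, CC, illustrative Line_Break) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),   # General Category
        unicodedata.combining(ch),  # Canonical Combining Class
        ILLUSTRATIVE_LINE_BREAK[ch],
    )


rows = [describe(ch) for ch in ILLUSTRATIVE_LINE_BREAK]

for code_point, gc, cc, lb in rows:
    print(f"{code_point}  GC={gc}  CC={cc}  LB={lb}")
```

With a current Unicode database, all four characters should report the same
GC and CC, while their line-breaking behavior differs, so the distinction has
to come from the separate property data that the UAXs define.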


