L2/01-317

**From:** Michel Suignard [michelsu@microsoft.com]

**Sent:** Tuesday, August 14, 2001 1:23 PM

**To:** Winkler, Arnold F

**Subject:** Bracket Disunification & Normalization

August 14th 2001

This is my paper on the bracket disunification. It complements paper WG2 N2345, which was presented at the last WG2 meeting, and clarifies the issues that were brilliantly presented by Ken Whistler in his email titled 'Bracket Disunification & Normalization Hell'.

The issue arises mostly from the fact that the characters encoded in the range 3000-303F (CJK Symbols and Punctuation) have historically been used for CJK processing, mainly for parenthetical notation. Their origin goes back to terminal displays with fixed cell width, and their advance width was made 'wide' to match the surrounding CJK characters. This has resulted in very precise typographic guidelines for these characters, which are followed by the major fonts available in these markets.

Although these characters have some glyphic similarities with mathematical characters, they are not intended to be used for that purpose. Their character metrics are fundamentally different:

1) They are typically full-width characters (EM size)

2) They have either a preceding or a following blank space (to emphasize their
parenthetical nature), and that blank space can be adjusted in text
compression/expansion or even in simple kerning (without text justification)

3) They use a centered baseline (instead of the low alphabetical baseline
used for most other symbols)

4) They participate in a very precise way in vertical writing. Unlike most
CJK characters, they don't stay upright, but go 'sideways', using an alternate
glyph with the blank space located appropriately to allow typical blank space
management.

For these reasons, these characters are not compatibility characters. Mathematical characters and CJK symbols cannot be canonicalized into each other. They basically address different needs. Furthermore, as pointed out by Ken, the logical wide-to-narrow decomposition (which would parallel the full-width character decomposition) cannot be used anymore, as it would break normalization.
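The normalization constraint can be illustrated with a short Python sketch using the standard `unicodedata` module. It assumes the existing canonical mapping 2329 ==> 3008 shown in the character list of this paper; the helper name `canonical_target` is hypothetical, and the results reflect whichever Unicode version the Python runtime ships with.

```python
import unicodedata

def canonical_target(ch: str) -> str:
    """Return the NFC form of a single character (hypothetical helper)."""
    return unicodedata.normalize("NFC", ch)

LEFT_POINTING = "\u2329"  # 2329 LEFT-POINTING ANGLE BRACKET
CJK_LEFT = "\u3008"       # 3008 LEFT ANGLE BRACKET

# 2329 has a singleton canonical decomposition to 3008, so every
# normalization form folds the two characters together.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, f"{ord(unicodedata.normalize(form, LEFT_POINTING)):04X}")

# The raw decomposition fields: a canonical mapping for 2329,
# and nothing at all for the CJK bracket itself.
print(repr(unicodedata.decomposition(LEFT_POINTING)))
print(repr(unicodedata.decomposition(CJK_LEFT)))
```

Because 2329 already normalizes to 3008, any text that has been normalized contains only the CJK form, which is why the two cannot simply be given different decompositions after the fact.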

(It could be argued that some existing wide-to-narrow decompositions are sometimes used out of bounds, as the usage of the two forms is very different, but it is too late to open that debate!)

The author has no strong preference regarding the addition of extra CJK symbols for the 'double left/right parenthesis'. One set has to be encoded in the Miscellaneous Mathematical Block (29xx range) for mathematical use. A CJK version is probably required, but further study of the expected typographic behavior of the JIS 213 CJK white left/right parenthesis should be done before a final decision. This can still be done before the disposition of comments of the FPDAM1.

Following is the list of affected characters and their old and new properties:

Old: GCat = Ps, EAW = A, Other_Math = Y

New: GCat = Ps, EAW = W, Other_Math = N

----

2329 LEFT-POINTING ANGLE BRACKET ==> 3008

3008 LEFT ANGLE BRACKET

301A LEFT WHITE SQUARE BRACKET

Old: GCat = Ps, EAW = A, Other_Math = N

New: GCat = Ps, EAW = W, Other_Math = N

----

300A LEFT DOUBLE ANGLE BRACKET

3014 LEFT TORTOISE SHELL BRACKET

3018 LEFT WHITE TORTOISE SHELL BRACKET

Old: GCat = Pe, EAW = A, Other_Math = Y

New: GCat = Pe, EAW = W, Other_Math = N

----

232A RIGHT-POINTING ANGLE BRACKET ==> 3009

3009 RIGHT ANGLE BRACKET

301B RIGHT WHITE SQUARE BRACKET

Old: GCat = Pe, EAW = A, Other_Math = N

New: GCat = Pe, EAW = W, Other_Math = N

----

300B RIGHT DOUBLE ANGLE BRACKET

3015 RIGHT TORTOISE SHELL BRACKET

3019 RIGHT WHITE TORTOISE SHELL BRACKET
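As a rough aid to reading the old/new property list above, the proposed changes can be transcribed into a small Python table. This is only a sketch: the names `Props`, `CHANGES`, `_rows`, and `changed_fields` are hypothetical, and the values are copied from this paper rather than taken from any published Unicode data file.

```python
from typing import NamedTuple

class Props(NamedTuple):
    gcat: str         # General Category (Ps = open, Pe = close)
    eaw: str          # East Asian Width
    other_math: bool  # Other_Math (Y = True, N = False)

def _rows(code_points, old, new):
    """Attach the same (old, new) property pair to several code points."""
    return {cp: (old, new) for cp in code_points}

# Transcribed from the list above; not read from Unicode data files.
CHANGES = {
    **_rows((0x2329, 0x3008, 0x301A),
            Props("Ps", "A", True), Props("Ps", "W", False)),
    **_rows((0x300A, 0x3014, 0x3018),
            Props("Ps", "A", False), Props("Ps", "W", False)),
    **_rows((0x232A, 0x3009, 0x301B),
            Props("Pe", "A", True), Props("Pe", "W", False)),
    **_rows((0x300B, 0x3015, 0x3019),
            Props("Pe", "A", False), Props("Pe", "W", False)),
}

def changed_fields(old: Props, new: Props) -> list:
    """Names of the properties whose value differs between old and new."""
    return [f for f in Props._fields if getattr(old, f) != getattr(new, f)]

for cp, (old, new) in sorted(CHANGES.items()):
    print(f"{cp:04X}: {', '.join(changed_fields(old, new))}")
```

Read this way, every affected character moves from EAW = A to EAW = W, the General Category never changes, and Other_Math is dropped only for the six characters that previously carried it.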

The new characters added would have the following properties:

GCat = Ps, EAW = Na, Other_Math = Y

----

2B00 MATHEMATICAL LEFT WHITE SQUARE BRACKET

2B02 MATHEMATICAL LEFT ANGLE BRACKET

2B04 MATHEMATICAL LEFT DOUBLE ANGLE BRACKET

2985 MATHEMATICAL WHITE LEFT PARENTHESIS (already part of amend. 1)

GCat = Pe, EAW = Na, Other_Math = Y

----

2B01 MATHEMATICAL RIGHT WHITE SQUARE BRACKET

2B03 MATHEMATICAL RIGHT ANGLE BRACKET

2B05 MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET

2986 MATHEMATICAL WHITE RIGHT PARENTHESIS (already part of amend. 1)

If the CJK symbols were added they would have the following properties:

GCat = Ps, EAW = W, Other_Math = N

----

33DE WHITE LEFT PARENTHESIS

GCat = Pe, EAW = W, Other_Math = N

----

33DF WHITE RIGHT PARENTHESIS

----

GCat = Ps, EAW = F, Other_Math = N

----

FF5F FULLWIDTH LEFT WHITE PARENTHESIS

GCat = Pe, EAW = F, Other_Math = N

----

FF60 FULLWIDTH RIGHT WHITE PARENTHESIS

----