ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS FOR ADDITIONS TO THE REPERTOIRE OF ISO/IEC 10646

Please fill Sections A, B and C below. Section D will be filled by SC 2/WG 2.

For instructions and guidance on filling in the form, please see the document "Principles and Procedures for Allocation of New Characters and Scripts" (http://www.dkuug.dk/JTC1/SC2/WG2/prot)

1. Title: Disunify braces/brackets for math, computing science, and Z notation from similar-looking CJK braces/brackets

2. Requester's name: Kent Karlsson (and Asmus Freytag?)

3. Requester type (Member body/Liaison/Individual contribution):

4. Submission date: 2001-01-16

5. Requester's reference (if applicable):

6. (Choose one of the following:)
This is a complete proposal

1.2. B. Technical - General

1.b. The proposal is for addition of character(s) to an existing block: X
Name of the existing block: Miscellaneous mathematical symbols

2. Number of characters in proposal: 6

3. Proposed category (see section II, Character Categories):

4. Proposed Level of Implementation (see clause 15, ISO/IEC 10646-1): 1
Is a rationale provided for the choice? Yes
If Yes, reference: (simple graphical characters, no combining or other implementation difficulties)

5. Is a repertoire including character names provided?: Yes

a. If YES, are the names in accordance with the 'character naming guidelines' in Annex K of ISO/IEC 10646-1? Yes
b. Are the character shapes attached in a reviewable form? […]

6. Who will provide the appropriate computerized font (ordered preference: True Type, PostScript or 96x96 bit-mapped format) for publishing the standard? Unicode Consortium?

If available now, identify source(s) for the font (include address, e-mail, ftp-site, etc.) and indicate the tools used:

7. References:
a. Are references (to other character sets, dictionaries, descriptive texts etc.) provided?

b. Are published examples (such as samples from newspapers, magazines, or
other sources) of use of proposed characters attached?

8. Special encoding issues:

Does the proposal address other aspects of character data processing (if applicable) such as input, presentation, sorting, searching, indexing, transliteration etc. (if yes please enclose information): No.

1.3. C. Technical - Justification

1. Has this proposal for addition of character(s) been submitted before? No.

If YES explain

2. Has contact been made to members of the user community (for example: National Body, user groups of the script or characters, other experts, etc.)?

If YES, with whom?
If YES, available relevant documents?

3. Information on the user community for the proposed characters (for example: size,
demographics, information technology use, or publishing use) is included? No.
Reference:

4. The context of use for the proposed characters (type of use, common or rare) Common in math, computing science, and Z notation
Reference:

5. Are the proposed characters in current use by the user community? Yes
If YES, where? Reference:

6. After giving due considerations to the principles in N 1352 must the proposed
characters be entirely in the BMP? Yes
If YES, is a rationale provided? Yes
If YES, reference: (Co-location with similar (and less used) characters in the misc. math. symbols block.)

7. Should the proposed characters be kept together in a contiguous range (rather than
being scattered)? Nearly (see detailed proposal below).

8. Can any of the proposed characters be considered a presentation form of an existing
character or character sequence? No.
If YES, is a rationale for its inclusion provided?
If YES, reference:

9. Can any of the proposed character(s) be considered to be similar (in appearance or function) to an existing character? Yes.
If YES, is a rationale for its inclusion provided? Yes.
If YES, reference: (Though similar in appearance to some CJK punctuation, the use context, typographic appearance, and typographic spacing properties are different.)

10. Does the proposal include use of combining characters and/or use of composite sequences (see clause 4.11 and 4.13 in ISO/IEC 10646-1)? No.
If YES, is a rationale for such use provided?
If YES, reference:

Is a list of composite sequences and their corresponding glyph images (graphic symbols) provided?
If YES, reference: N/A

11. Does the proposal contain characters with any special properties such as control function or similar semantics? No.
If YES, describe in detail (include attachment if necessary)

1.4. D. SC 2/WG 2 Administrative (To be completed by SC 2/WG 2)

1. Relevant SC 2/WG 2 document numbers:

2. Status (list of meeting number and corresponding action or disposition):

3. Additional contact to user communities, liaison organizations etc:

4. Assigned category and assigned priority/time frame:

A. List of suggested characters

List with suggested code position, and suggested name, of the six characters suggested by this proposal:

2997, LEFT DOUBLE SQUARE BRACKET

2998, RIGHT DOUBLE SQUARE BRACKET

29D8, LEFT ANGLE BRACE

29D9, RIGHT ANGLE BRACE

29DA, LEFT DOUBLE ANGLE BRACE

29DB, RIGHT DOUBLE ANGLE BRACE

B. Use of the suggested characters

The double square brackets,

2997, LEFT DOUBLE SQUARE BRACKET

2998, RIGHT DOUBLE SQUARE BRACKET

are commonly used in computing science (and Z notation) as “abstract syntax” brackets. In papers or books they are usually produced by kerning [[ and ]] respectively until the glyphs touch (when TeX is used, sometimes via custom-made commands under varying names).

The single angle braces,

29D8, LEFT ANGLE BRACE

29D9, RIGHT ANGLE BRACE

are commonly used in math and computing science as tuple brackets (or sequence brackets, as in Z notation). These are produced in LaTeX by \langle and \rangle.

The double angle braces,

29DA, LEFT DOUBLE ANGLE BRACE

29DB, RIGHT DOUBLE ANGLE BRACE

are used in Z notation as data braces. These may be produced in LaTeX by custom-made \ldata and \rdata commands.

C. Similar characters

Note that the “miscellaneous technical” symbols

2329, LEFT-POINTING ANGLE BRACKET

232A, RIGHT-POINTING ANGLE BRACKET

are in Unicode canonically equivalent to

3008, LEFT ANGLE BRACKET

3009, RIGHT ANGLE BRACKET

respectively. Making these canonically equivalent may have been a mistake, but the equivalence is firmly entrenched and cannot now be revoked.
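The practical effect of this canonical equivalence can be illustrated with a small sketch. It assumes Python's standard unicodedata module as the normalization implementation; the helper name canonically_equivalent is hypothetical and used only for illustration.

```python
import unicodedata

# The "miscellaneous technical" angle brackets and their CJK counterparts.
PAIRS = [("\u2329", "\u3008"), ("\u232A", "\u3009")]


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


for technical, cjk in PAIRS:
    # Normalization replaces the technical bracket by the CJK one,
    # so the distinction does not survive normalized interchange.
    normalized = unicodedata.normalize("NFC", technical)
    print(
        f"U+{ord(technical):04X} normalizes to U+{ord(normalized):04X};",
        "equivalent:", canonically_equivalent(technical, cjk),
    )
```

Any process that normalizes text will therefore turn 2329/232A into 3008/3009, which is why the existing pair cannot serve as distinct math characters.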

3008 and 3009 are CJK punctuation characters, similar in use to

2039, SINGLE LEFT-POINTING ANGLE QUOTATION MARK

203A, SINGLE RIGHT-POINTING ANGLE QUOTATION MARK

and used in normal running text.

U+3008 (U+2329) and U+3009 (U+232A) are typeset with extra white-space on the outer side to make each as wide as a CJK ideograph. This makes 2329/3008 and 232A/3009 unsuitable for common use in math expressions.

Further,

300A, LEFT DOUBLE ANGLE BRACKET

300B, RIGHT DOUBLE ANGLE BRACKET

are also CJK punctuation characters, similar in use to

00AB, LEFT-POINTING DOUBLE ANGLE QUOTATION MARK

00BB, RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK

and used in normal running text.

U+300A and U+300B are likewise typeset with extra white-space on the outer side to make each as wide as a CJK ideograph. This makes 300A and 300B unsuitable for use in math expressions (Z notation in particular).

Finally,

301A, LEFT WHITE SQUARE BRACKET

301B, RIGHT WHITE SQUARE BRACKET

are also CJK punctuation.

U+301A and U+301B are likewise typeset with extra white-space on the outer side to make each as wide as a CJK ideograph. This makes 301A and 301B unsuitable for use in computing science expressions.
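The ideograph-wide treatment of these six characters can be checked against character property data. The sketch below assumes Python's standard unicodedata module and uses its East Asian Width classification as a rough proxy for "typeset as wide as a CJK ideograph"; that property is a character-level classification, not a font metric, and the helper name width_report is hypothetical.

```python
import unicodedata

# The six CJK punctuation characters discussed above.
CJK_BRACKETS = "\u3008\u3009\u300A\u300B\u301A\u301B"


def width_report(chars: str) -> list[tuple[str, str, str]]:
    """Return (code point, character name, East Asian Width class) per character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.east_asian_width(c))
        for c in chars
    ]


for code, name, width in width_report(CJK_BRACKETS):
    # "W" marks characters classified as wide, like CJK ideographs.
    print(code, name, width)
```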

D. Discussion

This is a proposal to disunify the math brackets from the CJK quotation brackets, adding the math versions to the new miscellaneous math symbols block, where there are similar brackets/braces.  If possible, that block could be rearranged to put all the brackets/braces (including these disunified ones) together.

Since the characters similar to the ones suggested here are in the CJK punctuation block, they are not widely recognised as “mathematical” characters. Fonts and other support that otherwise try to cover “mathematical” characters (including the standard “character collections” listed in 10646 itself) usually miss all CJK punctuation, even though some of the CJK punctuation is currently unified with the “mathematical” characters suggested here.

Further, consider, e.g., the single LEFT ANGLE characters (written here as < for illustration). If two of them are displayed (printed) together, they look like “<<” in Latin typography and like “ < <” in East Asian typography (note the white-space to the left of each < glyph). An appearance like “ < <” is not acceptable in East Asian typography either, so software recognizes the character (not the glyph) and kerns the pair as follows: “ <<” (notice that the space to the left of the first character remains). The same applies to the other CJK punctuation characters. This kerning is often called 'ideal width' or 'algorithmic kerning', since it does not use glyph metrics but assumes the presence of the white-space from knowledge of the character code. This is very different from what is done in Latin typography, but normal in East Asian usage.
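The character-code-driven kerning described above can be modelled with a minimal sketch. It is not taken from any real layout engine: the widths (1.0 em for a full-width character, of which 0.5 em is white-space on the left of an opening bracket), the restriction to opening brackets, and the function name ideal_width_advances are all assumptions made for illustration.

```python
# Assumed model: an opening CJK bracket is 1.0 em wide, and its left 0.5 em
# is white-space. These numbers are illustrative, not taken from a font.
FULL_WIDTH_EM = 1.0
HALF_WIDTH_EM = 0.5

# Opening CJK brackets discussed in this proposal.
OPENING_CJK_BRACKETS = {"\u3008", "\u300A", "\u301A"}


def ideal_width_advances(text: str) -> list[float]:
    """Return one advance width (in em) per character.

    The decision is made from character codes alone, never from glyph
    metrics: an opening bracket that directly follows another opening
    bracket loses its left white-space.
    """
    advances = []
    previous = None
    for ch in text:
        if ch in OPENING_CJK_BRACKETS and previous in OPENING_CJK_BRACKETS:
            advances.append(HALF_WIDTH_EM)  # left white-space removed
        else:
            advances.append(FULL_WIDTH_EM)  # full ideograph-wide cell
        previous = ch
    return advances


# Two U+3008 in a row: the first keeps its white-space, the second does not.
print(ideal_width_advances("\u3008\u3008"))
```

Under this model the first bracket still carries its leading white-space, which corresponds to the “ <<” appearance and shows why the same character cannot also behave like a tightly set math bracket.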

For these reasons, it is useful to disunify the brackets/braces by adding new “mathematical” ones, as suggested, in the new Miscellaneous Mathematical Symbols block, where similar “math” characters reside.

E. References

ISO/IEC WD2.6 13568, Formal Specification – Z Notation – Syntax, type and semantics.

TeX, LaTeX books…

Math text books…

CS text books…