L2/07-051

Asmus Freytag
February 2, 2007

In response to action item

109-A9 (Asmus Freytag): Propose a disposition of the UTR #25 math data tables in line with the proposal in L2/06-352 for the February 2007 UTC meeting. [L2/06-352; in progress]

I submit this document for discussion at UTC meeting #110.

I have reviewed the single data file for UTR #25 (Mathclass.txt)
and suggest the disposition of its data described below.

This file consists of five fields:

 0: code point or range
 1: class
 2: entity name
 3: ISO entity set
 4: remarks

followed by the usual comment with character name, etc.

If the file is split into two files,

MathClass.txt (fields 0 and 1)
EntityMapping.txt (fields 0, 2, 3 and 4)

then the math classification and the mapping information
could be housed in the appropriate locations proposed
in L2/06-352 (with the mapping information, which by
its nature is less subject to versioning, in the MAPPINGS
folder).
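
As a rough illustration only, the split could be sketched in Python
as follows. The sketch assumes the usual data file syntax (fields
separated by ';', with '#' starting the trailing comment) and that
remarks never contain '#'. The function name and the sample record
are made up for this example.

```python
def split_record(line):
    """Split one combined record into (class_line, mapping_line).

    Assumption: ';' separates fields and '#' starts the comment.
    Returns None for blank or comment-only lines.
    """
    data, _, comment = line.partition("#")
    if not data.strip():
        return None
    fields = [f.strip() for f in data.split(";")]
    # Pad so records with missing trailing fields still split cleanly.
    fields += [""] * (5 - len(fields))
    code, math_class, entity, entity_set, remarks = fields[:5]
    comment = comment.strip()
    suffix = f" # {comment}" if comment else ""
    # Fields 0 and 1 go to the classification file.
    class_line = f"{code};{math_class}{suffix}"
    # Fields 0, 2, 3 and 4 go to the mapping file.
    mapping_line = f"{code};{entity};{entity_set};{remarks}{suffix}"
    return class_line, mapping_line


# Hypothetical sample record, for illustration only.
print(split_record("002B;V;plus;ISONUM;# PLUS SIGN"))
```

Both output lines keep the character name comment, so each file
remains readable on its own.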

The advantage of this is that the classification information
is more readily versioned and available to tools designed
to parse standard property files.
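
To illustrate what such a tool might look like, here is a minimal
Python sketch of a reader for a two-field file. It assumes the
common property file conventions: ';' between fields, '#' before a
comment, and 'XXXX..YYYY' for an inclusive code point range. The
function name and the sample lines are hypothetical.

```python
def parse_property_file(lines):
    """Map each code point to its class from 'code;class # comment' lines.

    Assumption: ranges are written 'XXXX..YYYY' in hexadecimal and
    are inclusive at both ends.
    """
    classes = {}
    for line in lines:
        data = line.partition("#")[0].strip()
        if not data:
            continue  # blank or comment-only line
        code, _, rest = data.partition(";")
        value = rest.split(";")[0].strip()
        first, _, last = code.strip().partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        for cp in range(start, end + 1):
            classes[cp] = value
    return classes


# Hypothetical sample lines, for illustration only.
sample = [
    "# comment-only header line",
    "002B;V # PLUS SIGN",
    "0030..0039;N # DIGIT ZERO..DIGIT NINE",
]
table = parse_property_file(sample)
print(table[0x2B], table[0x35])
```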

The downside is that proofing the information will be slightly
complicated by the absence of the mapping information.

I believe the latter issue could be handled either by tools
that can merge the files for review, or by retaining some
of the mapping information as comments rather than data.
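
A merge tool of the first kind need not be elaborate. The Python
sketch below rejoins the two files on field 0 for proofing. It
assumes both files use ';' separated fields with '#' comments and
write each code point or range identically in field 0. All names
and sample lines are made up for this example.

```python
def strip_comment(line):
    """Return the data part of a line, without the trailing '#' comment."""
    return line.partition("#")[0].strip()


def merge_for_review(class_lines, mapping_lines):
    """Rejoin classification and mapping records keyed on field 0."""
    mapping = {}
    for line in mapping_lines:
        data = strip_comment(line)
        if data:
            code, _, rest = data.partition(";")
            mapping[code.strip()] = rest.strip()

    merged = []
    for line in class_lines:
        data = strip_comment(line)
        if not data:
            continue  # blank or comment-only line
        code, _, math_class = data.partition(";")
        code = code.strip()
        # Flag records without mapping data instead of dropping them.
        rest = mapping.get(code, "<no mapping>")
        merged.append(f"{code};{math_class.strip()};{rest}")
    return merged


# Hypothetical sample lines, for illustration only.
class_lines = ["002B;V # PLUS SIGN", "003D;R # EQUALS SIGN"]
mapping_lines = ["002B;plus;ISONUM; # PLUS SIGN"]
for row in merge_for_review(class_lines, mapping_lines):
    print(row)
```

Records that lack a mapping entry are marked in the output, which
is the case a reviewer would most want to see.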

There has been a proposal to subdivide the classification
to allow the distinction of binary-only operators from
operators that can be binary or unary. This seems appropriate
and a draft is being worked on.

Finally, there has been a suggestion to add another
property to show other layout properties, such as
stretchiness. If this requires a new field, it is better
implemented as an additional data file. We don't
really want to recreate UnicodeData.txt with all
its warts ;-)