Additional comments for FPDAM ballot on Amd. 1 for 10646-1:2000; i.e. on document SC2/N3530, WG2/N2364.
Document Type: Working Group Document
Source: Kent Karlsson
Status: Expert Contribution
The following statement was made by the Swedish NB:
De kommentarer som är utlämnade [ur NB-kommentarerna relativt förslag] ansågs som
fullt relevanta men ej lämpliga att ta upp som kommentarer till en FPDAM. Vi föreslår därför
att Kent sänder in dem som ett expertutlåtande till editorn så att de inte försvinner totalt.
The comments that are [were] omitted [from the NB comments, relative to suggested
comments] were deemed to be fully relevant, but not suitable to submit as [NB comments]
to an FPDAM. We [the group at the Swedish NB] therefore suggest that Kent [Karlsson]
submit them as an expert contribution to the editor, so that they do not disappear completely.
Below you find the comments that should be recorded, though not as part of the Swedish NB comments.
Note that some of them suggest postponing the encoding of certain characters.
Please consider also these comments at Meeting 41 of SC2/WG2.
1. (Not in ballot document): The description of the ideographic description
characters should be moved from annex F to annex P, since the ideographic
description characters are not alternate format characters. Add an 'item' to this effect.
2. (Not in ballot document): The typographic styles of annex F and annex P should be
3. Item 8: The two MES-3 collections (A and B) should include all Cyrillic characters and
all math operators, all other math symbols, and all arrows now added via this amendment.
The LINE SEPARATOR and PARAGRAPH SEPARATOR should be included in each
of the MES collections, not just the two MES-3. [This is contrary to the SE NB comment.]
4. Item 14: A semicolon-separated format for the datafile would be preferable, and would
avoid the byte counting oddity. Say explicitly that the encoding of the datafile is ASCII.
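To illustrate the suggestion in item 4, here is a minimal sketch of reading a semicolon-separated datafile that is explicitly ASCII. The field layout (code position, then name), the function name parse_records and the sample bytes are assumptions made for illustration only; the actual fields of the datafile are not described here.

```python
def parse_records(data: bytes) -> list:
    """Split an ASCII, semicolon-separated datafile into lists of fields.

    Hypothetical layout: one record per line, fields separated by ';'.
    """
    # Decoding as ASCII makes the stated encoding explicit: any byte
    # outside 00-7F raises UnicodeDecodeError instead of being guessed at.
    text = data.decode("ascii")
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        # Fields are found by the separator, so no byte counting is needed.
        records.append([field.strip() for field in line.split(";")])
    return records


# Illustrative sample only; not taken from the ballot document.
sample = b"0041;LATIN CAPITAL LETTER A\n20AC;EURO SIGN\n"
records = parse_records(sample)
```

Because fields are located by the separator, a record stays parseable even if a field changes length, which is the point of preferring this format over fixed byte offsets.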
5. Item 14, last 'note': This is both explanatory and confusing. There seems to
be no real reason given for the results being different in principle.
Note for future edition: A future edition of 10646-1 should have, in annex S, a
complete list of ideographs that were included because of the 'source separation rule',
not just a list of "source separation examples".
6. U+034F COMBINING GRAPHEME JOINER: This is a very strange
character, not at all like other combining characters, and it is not sufficiently motivated.
Only very rare, and possibly spurious, possible uses have been cited. One should at least
wait with the introduction of this strange beast. It might not be needed at all.
7. U+17D8 (in the Khmer block, not in ballot): The glyph looks exactly like
the sequence of three characters: <U+17D4, U+179B, U+17D4>. Mistake?
8. U+2071 SUPERSCRIPT LATIN SMALL LETTER I: this one should
either be moved out of the superscript digits subblock, or simply not be allocated.
Note that the sequence <0020, 0365> could, and should, be used instead.
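As a small illustration of item 8, the sketch below builds the suggested sequence <0020, 0365> and looks up the names of the characters involved using Python's standard unicodedata module. The variable names are hypothetical, and the lookups assume a character database recent enough to contain both characters.

```python
import unicodedata

# The sequence suggested in item 8: SPACE followed by U+0365.
sequence = "\u0020\u0365"

# The single character that item 8 argues should not be allocated
# in the superscript digits subblock.
proposed = "\u2071"

combining_name = unicodedata.name(sequence[1])
proposed_name = unicodedata.name(proposed)

# U+0365 is a combining mark (non-zero combining class); U+2071 is not.
combining_class = unicodedata.combining(sequence[1])
proposed_class = unicodedata.combining(proposed)
```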
9. The four proposed characters "SCAN LINE-n" (U+23BA-U+23BD):
Remove these. They only occurred in a few terminals, and don't appear useful even
today. Those who absolutely want to support these "characters" (that only appear to
be spurious 'fill-the-(then)-code-space' things) should use the PUA. (And, the names
are confusing for use with modern equipment.)
There is also a proposal being prepared to include "very, very large sigma" pieces
for use on fixed cell-size terminals. They appear entirely unused, and not even the
proponents can point to anyone using them, nor can they even describe their
proper use. All of these unused terminal oddities found in some old terminal
manuals should be referred to the PUA.
10. U+2772, U+2773 (Zapf dingbat tortoise shell brackets): The Zapf dingbats are
supposedly of a specific font. However, these tortoise shell brackets don't appear at all
'ornamental', so they appear very much unifiable with the math version of the tortoise
shell brackets (WG2/N2345R; not in FPDAM, but referred to in WG 2 resolution M40.7).
11. U+FE00 (VARIATION SELECTOR-1): This one is intended for
marking "variants" of mathematical characters. It is doubtful that this is a good idea.
Sweden recommends waiting with this character, and with the entire block.
12. The differing covering ranges of the encoding forms UCS-4, UTF-16,
and UTF-8 are an interoperability problem. Their covering ranges should explicitly
be made exactly the same: 0000-D7FF union E000-10FFFF. UCS-2 should be
explicitly deprecated, since it cannot represent all characters (now) in 10646.
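The common covering range proposed in item 12 can be written down as a simple membership test. This is only a sketch: the function and constant names are hypothetical, and fits_in_ucs2 merely expresses the observation that a form using one 16-bit unit per character cannot go beyond FFFF.

```python
# Boundaries taken from item 12: 0000-D7FF union E000-10FFFF.
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF
MAX_CODE_POSITION = 0x10FFFF


def in_common_range(cp: int) -> bool:
    """True if cp lies in 0000-D7FF or in E000-10FFFF."""
    return 0 <= cp < SURROGATE_FIRST or SURROGATE_LAST < cp <= MAX_CODE_POSITION


def fits_in_ucs2(cp: int) -> bool:
    """True if cp is in the common range and fits in a single 16-bit unit."""
    return in_common_range(cp) and cp <= 0xFFFF


# Code positions in the common range that UCS-2 cannot represent
# all lie above FFFF, for example 10000.
beyond_ucs2 = in_common_range(0x10000) and not fits_in_ucs2(0x10000)
```

If UCS-4, UTF-16 and UTF-8 all accepted exactly the code positions for which in_common_range is true, any value valid in one form would be valid in the other two.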