SC2/WG2 N2379


Additional comments for FPDAM ballot on Amd. 1 for 10646-1:2000; i.e. on document SC2/N3530, WG2/N2364.


Document Type: Working Group Document

Source: Kent Karlsson

Status: Expert Contribution

Date: 2001-10-03



The following statement was made by the Swedish NB:

            De kommentarer som är utlämnade [ur NB-kommentarerna relativt förslag] ansågs som

            fullt relevanta men ej lämpliga att ta upp som kommentarer till en FPDAM. Vi föreslår därför

            att Kent sänder in dem som ett expertutlåtande till editorn så att de inte försvinner totalt.


            The comments that are [were] omitted [from the NB comments, relative to suggested

            comments] were deemed to be fully relevant, but not suitable to submit as [NB comments]

            to an FPDAM. We [the group at the Swedish NB] therefore suggest that Kent [Karlsson]

            submit them as an expert contribution to the editor, so that they do not disappear completely.


Below you find the comments that should be recorded, though not as part of the Swedish NB comments.

Note that some of them suggest delaying the encoding of certain characters.

Please also consider these comments at Meeting 41 of SC2/WG2.




 1.        (Not in ballot document): The description about ideographic description

            characters should be moved from annex F to annex P, since the ideographic

            description characters are not alternate format characters.  Add an 'item' to this effect.


 2.        (Not in ballot document): The typographic styles of annex F and annex P should be



 3.        Item 8: The two MES-3 collections (A and B) should include all Cyrillic characters and

            all math operators, all other math symbols,  and all arrows now added via this amendment.

            The LINE SEPARATOR and PARAGRAPH SEPARATOR should be included in each

            of the MES collections, not just the two MES-3. [This is contrary to the SE NB comment.]


 4.        Item 14: A semicolon-separated format for the datafile would be preferable, and would

            avoid the byte counting oddity.  Say explicitly that the encoding of the datafile is ASCII.


 5.        Item 14, last 'note': This is both explanatory and confusing.  There seems to

            be no real reason given for the results being different in principle.


            Note for future edition: A future edition of 10646-1 should have, in annex S, a

            complete list of ideographs that were included because of the 'source separation rule',

            not just a list of "source separation examples".


 6.        U+034F COMBINING GRAPHEME JOINER: This is a very strange

            character, not at all like other combining characters, and it is not sufficiently motivated.

            Only very rare, and possibly spurious, possible uses have been cited.  One should at least

            wait with the introduction of this strange beast.  It might not be needed at all.


 7.        U+17D8 (in the Khmer block, not in ballot): The glyph looks exactly like

            the sequence of three characters: <U+17D4, U+179B, U+17D4>.  Mistake?


 8.        U+2071 SUPERSCRIPT LATIN SMALL LETTER I: this one should

            either be moved out of the superscript digits subblock, or simply not be allocated.

            Note that the sequence <0020, 0365> could, and should, be used instead.


 9.        The four proposed characters "SCAN LINE-n" (U+23BA-U+23BD):

            Remove these.  They only occurred in a few terminals, and don't appear useful even

            today.  Those who absolutely want to support these "characters" (that only appear to

            be spurious 'fill-the-(then)-code-space' things) should use the PUA.   (And, the names

            are confusing for use with modern equipment.)


            There is also a proposal being prepared to include "very, very large sigma" pieces

            for use on fixed cell-size terminals.  They appear entirely unused, and not even the

            proponents can point to anyone using them, nor can they even describe their

            proper use.  All of these unused terminal oddities found in some old terminal

            manuals should be referred to the PUA.


 10.      U+2772, U+2773 (Zapf dingbat tortoise shell brackets): The Zapf dingbats are

            supposedly of a specific font.  However, these tortoise shell brackets don't appear at all

            'ornamental', so they appear very much unifiable with the math version of the tortoise

            shell brackets (WG2/N2345R; not in FPDAM, but referred to in WG 2 resolution M40.7).


 11.      U+FE00 (VARIATION SELECTOR-1): This one is intended for

            marking "variants" of mathematical characters.  It is doubtful that this is a good idea.

            Sweden recommends waiting with this character, and with the entire block.


 12.      The differing covering ranges of the encoding forms UCS-4, UTF-16,

            and UTF-8 are an interoperability problem. Their covering ranges should explicitly

            be made exactly the same: 0000-D7FF union E000-10FFFF. UCS-2 should be

            explicitly deprecated, since it cannot represent all characters (now) in 10646.
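
The common covering range proposed in comment 12 can be expressed as a simple membership test. The following is a minimal sketch in Python, not text proposed for the standard; the function names in_common_range and fits_in_ucs2 are hypothetical, and the second function assumes that UCS-2 is limited to one 16-bit value per character.

```python
def in_common_range(cp: int) -> bool:
    """True if cp lies in 0000-D7FF union E000-10FFFF (the range in comment 12)."""
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


def fits_in_ucs2(cp: int) -> bool:
    """True if cp is in the common range and also fits in one 16-bit value.

    Assumption: UCS-2 uses exactly one 16-bit value per character, so it
    cannot reach anything above FFFF.
    """
    return in_common_range(cp) and cp <= 0xFFFF


if __name__ == "__main__":
    # A few boundary values around the two sub-ranges.
    for cp in (0x0041, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0xFFFF, 0x10000, 0x10FFFF, 0x110000):
        print(f"{cp:06X}  common range: {in_common_range(cp)!s:5}  UCS-2: {fits_in_ucs2(cp)}")
```

Under these assumptions, every value accepted by in_common_range but rejected by fits_in_ucs2 (that is, 10000-10FFFF) is a character position that UCS-2 cannot represent, which is the reason given above for deprecating it.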