Universal Multiple-Octet Coded Character Set (UCS)



Date: 1999-09-07


TITLE: Editor's response to Japan NB comments on N2005 (WD 10646-1 ed.2)

SOURCE: Bruce Paterson, project editor

STATUS: Personal contribution

ACTION: For review and dispositions by WG2


Japanese NB comments on the draft text of 10646-1 2nd Edition (WG2 N 2005) have been distributed as SC2 N 3354 dated 1999-08-31.

This paper gives the editor's proposed Disposition of Comments on each item in that document, for review, amendment as required, and decision by WG2.

The comments in SC2 N 3354 are qualified as either Editorial, Minor Technical, or Major Technical. In the editor's opinion, some of the Minor Technical comments can be regarded as Editorial. Accordingly the qualifications are shown for convenience as follows, after the comment number in the list of proposed Dispositions below.

- Editorial (E),

- Minor Technical (Mi),

- Minor Technical regarded as Editorial (Mi>E),

- Major Technical (Ma).

Editorial comments can be processed immediately for inclusion in the 2nd Edition. Other comments will need to be balloted in a Technical Corrigendum.

1 (E) Accepted. In 4.2 replace "cannot" by "does not".

2 (Mi>E) Accepted in part.

In 4.32 RC-element, after " four octet sequence" insert "(in the canonical form)".

Justification: The fact that RC-elements are used only in UCS-2 and UTF-16 does not affect their definition. The definition should not "preview" the normative statements in the standard. Clause 13.1 and Annex C clearly refer to RC-elements, and clause 13.2 does not.

3 (E) Accepted. 4.33 will be renumbered as 4.32.

4 (Mi>E) Accepted in part. In 4.23 high-half zone, after "may be used" insert "in UTF-16".

In 4.26 low-half zone, make similar change.

Justification: The word "reserved" is correctly used. For example, "seat reservation for Mr X on an aircraft" does not mean that the seat is never used.

5 (E) Accepted in principle. At the beginning of the last paragraph of clause 5 replace "A UCS Transformation ..." by "Another UCS Transformation ...".

6 (Mi>E) Accepted in part.

- Amend title of 6.5 from "identifiers for characters" to "Short identifiers for characters".

- In 6.5 add "short" before "identifier" wherever it does not already appear, e.g. the paragraph after para e., and in Notes 1 & 2.

Justification: A definition of "short identifier" is not needed since the term is not used outside 6.5.

7 (Mi>E) Accepted. The last sentence of 6.5 will become a Note.

Justification: Users need to know that a character has only one short identifier, worldwide, but may have an alternative name in another language (see Annex L).

8 (E) Accepted. In 6.5 para. 8 the words "CAPITAL" and "SMALL" will be written in lower case.

9 (Mi>E) Accepted. In 13, para 1 will now read:

ISO/IEC 10646 provides four alternative forms of coded representation of characters. Two of these forms are specified in this clause, and two others, UTF-16 and UTF-8, are specified in Annexes C and D respectively.

10 (E) Accepted. In 15 para.2, after "clause 13" insert ", Annex C and Annex D".

11 (Mi>E) Accepted in principle. In 16.2, after the list of sequences insert:

"or from the lists in C.5 and D.6."

12 (Ma) Not accepted.

Justification: The reason given is not correct. Identifications can occur in a protocol data element which is not part of the CC-data-element conforming to UCS, e.g. an attribute field in a file descriptor on an exchangeable CD-ROM.

13 (E) Accepted. In 25.2, in line 2 of "examples" replace "is" by "as".

14 (Mi) Bullets 1, 2, and 3: WG2 to decide.

14 Bullet 4 (Mi>E) Accepted. In 25.2 reword the beginning of the last paragraph as follows:

A "unique-spelling" rule is defined as follows. According to this rule, no coded character ...

At the end of the paragraph add new sentence:

This "unique-spelling" rule shall apply in levels 1 and 2.

15 (E) Accepted. In A.1, delete the 2nd paragraph, and move Note 1 to just before Note 2.

16 (E) Accepted in principle. The following changes will be made to A.1.

- Collection 58 appears twice; the second instance will be changed to 59.

- Collections 57, 58, and 59 will read "(This collection number shall not be used, see Note 3)"

- Collection 299 will read "(This collection number shall not be used, see A.3.2)"

- A new NOTE 3 will be added as follows:

NOTE 3 Collections numbered 57, 58, and 59 were specified in the First Edition of this International Standard but have now been deleted.

17 (E) Accepted. In C.3 and C.6 the Note numbers will be deleted.


18 (E) Accepted in principle. The notation in C.6 and C.7 will be clarified.

In C.6 Example, paragraph 1 will read:

A receiving/originating device which only handles the Basic Latin repertoire, and uses boxes (shown as [*]) to display characters outside that repertoire, would display:

"The Greek letter <alpha> corresponds to <ideograph1>."

as

"The Greek letter [*] corresponds to [*]."

where the notation <xxx> represents a character outside the Basic Latin repertoire as indicated by xxx.
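As an illustration only, the behaviour of such a device can be sketched in Python. The function name and the choice of U+03B1 and U+4E00 as stand-ins for <alpha> and <ideograph1> are assumptions made for this sketch; the Basic Latin repertoire is taken to be code positions 0020 to 007E.

```python
def display_basic_latin_only(text: str, box: str = "[*]") -> str:
    """Replace every character outside Basic Latin (0020-007E) with a box.

    Hypothetical helper: it models a device limited to the Basic Latin
    repertoire, not any behaviour mandated by the standard.
    """
    return "".join(ch if 0x20 <= ord(ch) <= 0x7E else box for ch in text)


# U+03B1 and U+4E00 stand in for <alpha> and <ideograph1> (assumed values).
sample = "The Greek letter \u03b1 corresponds to \u4e00."
shown = display_basic_latin_only(sample)
```

With these assumed inputs, each character outside the repertoire is replaced by one box and the Basic Latin text is passed through unchanged.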

In C.7 Example, replace "Phoenician" by "Etruscan" throughout.

At the end of the Example add:

where <xxx-High> and <xxx-Low> represent RC-elements from the High-half and Low-half zones respectively corresponding to the character indicated by xxx.
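The relationship between a character and its <xxx-High> and <xxx-Low> RC-elements can be sketched as follows. This is a minimal illustration of the usual UTF-16 arithmetic, assuming the High-half zone D800 to DBFF and the Low-half zone DC00 to DFFF; the function name is hypothetical.

```python
from typing import Tuple


def to_rc_elements(code_position: int) -> Tuple[int, int]:
    """Split a code position in planes 01 to 10 into (high, low) RC-elements.

    Hypothetical helper for illustration; it does not handle BMP characters,
    which UTF-16 represents as a single RC-element.
    """
    if not 0x10000 <= code_position <= 0x10FFFF:
        raise ValueError("code position is not represented by a paired RC-element")
    offset = code_position - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits, High-half zone
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits, Low-half zone
    return high, low


first_pair = to_rc_elements(0x10000)
last_pair = to_rc_elements(0x10FFFF)
```

The first and last code positions outside the BMP map to the first and last values of each zone, which is a quick check that the two zones are fully used.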

19 (E) Accepted in principle.

Note numbers in D.2 will not change. Note numbers in D.4 become 1 & 2. Note numbers will be removed in D.5 and D.6.

Justification: Notes are numbered within a subclause, according to ITTF drafting rules.

20 (Ma) WG2 to decide, taking into account existing implementations of UTF-8.

21 (E) Accepted in principle.

In Annex L, Rule 3, para. 2, after "symbol" insert "not part of the name" (as in the previous paragraph).

22 (Mi>E) Accepted in principle. In N.3, in the entries "UTF16-form (5)" and "UTF8-form (8)", the letters UTF will be changed to utf.

23 (E) Accepted. In Annex P, after paragraph 3, delete "Group 00, Plane 00 (BMP)".

24 (E) Accepted in part. In Annex P, paragraph 2, replace "and " by "preceded by".

Justification: The Annex F style is not suitable, since Annex F is ordered by subject matter, while Annex P is ordered by code position.

25 (E) Accepted in part.

- Clause 26.2 describes hexadecimal and decimal values, and will not be changed.

- In Annexes P and Q the terms "hex" and equivalents will be removed.

26 (Mi>E) Not accepted. The editor does not have a suitable ideographic font readily available.

27 (E) Accepted. In Annex P entry FA1F the hyphen will be deleted.

28 (Mi>E) Accepted. In Annex P entry FFE3 replace "may also be" by "is also".

29 (E) Accepted in principle. The IRG is requested to provide a corrected version of Annex R for the Editor as an electronic file in final form for printing the camera-ready copy as soon as possible.

30 (General>E) Accepted. In 26.1, first paragraph, delete "and in applicable Amendments".