1997-Jan-09 Disposition of Comments


Danish Standards Association, Denmark

SC 2 N1456
1996-08-02
Danish comments on JTC 1 N3745

1.   DS would like that further development of the model be done,
  so that [it] includes terms like:  character, character set,
  character repertoire, coded character set, code table, character
  encoding scheme, glyph, font.

  Keld Simonsen provided the following additional clarification
  of the Danish comments on 1996-08-17 at the Quebec City
  meeting of WG2:

  Here are some definitions that I think are needed in the
  character/glyph TR to explain the SC2 concepts at an overall
  level, to get the meaning and relations between them. These
  explanations are also much needed in the context of
  programming languages, where WG20 is working on a guide to the
  design of programming languages, that builds on concepts like
  these, and an API standard for character handling. Also the
  Internet is working on concepts for character handling, where
  a SC2/SC18 explanation of the concepts and relations would be
  most welcome.

  The terms are taken from SC2 standards, especially from IS
  10646, but some of the terms are not defined there, although
  they are used, for example "coding system".

  Definitions:

  "character": a member of a set of elements used for the
  organization, control, or representation of data. (10646)
  Keld's comment: characters are meant to represent some kind of
  sound, at least for letters.

  "repertoire": A specified set of characters that are
  represented in a coded character set. (10646)
  Keld's comment: a repertoire contains a *finite* set of
  elements in the form of characters.

  "coded character set": A set of unambiguous rules that
  establishes a character set and the relationship between the
  characters of the set and their coded representation. (10646)

  Keld's comment: in ISO this means at least 3 things:

  1.   in ISO 2375 (the ECMA registry) this is a part known as
     G0/G1/G2/G3 or C0/C1 for graphic and control characters
     respectively. The control CCS are 32 cells each, while the G0 is
     94 cells and G1 is 96 cells respectively for 7/8 bit CCS. There
     is also a size for 14-bit CCS, which I cannot remember right now,
     and also specific sizes for special CCS of 256 cells.
  2.   in the 8859 series a CCS can consist of for example 2
     registered CCS of 2375. For example ISO 8859-1 is defined as ISO
     2375 register number 6 and number 100; it does not have control
     characters. ISO 646 IRV (US-ASCII) is defined as the graphical
     CCS registration number 6 and the control CCS registration number
     1. ISO 6937 is defined as ISO 2375 registration number 2 and then
     (I think) registration 101, with some specific rules for
     combinations of two octets into one character.
  3.   In ISO 9945 (POSIX) standards, a CCS is taken to be a
     description of the whole datastream, that is including both the
     control characters and the graphic characters. This is close to
     the MIME "charset" definition, but POSIX does not have
     specification techniques (yet) to define state-dependent
     encodings like 2022 based encodings. POSIX can handle ISO 6937
     and Shift-JIS and UTF-8 with its charmap specification technique.
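
  The point that the same character can have quite different coded
  representations, depending on which coded character set or
  encoding describes the datastream, can be illustrated with a
  minimal Python sketch. It assumes only the codecs shipped with
  the Python standard library; the variable names are chosen here
  for illustration and do not come from any of the standards
  mentioned.

```python
# Characters are written as escapes so the example does not depend on
# how this document itself is encoded.
a_ring = "\u00e5"   # LATIN SMALL LETTER A WITH RING ABOVE
ka = "\u30ab"       # KATAKANA LETTER KA

# The same abstract character gets different coded representations.
latin1_bytes = a_ring.encode("iso8859-1")   # single octet
utf8_bytes = a_ring.encode("utf-8")         # two octets

# Shift-JIS is a multi-octet encoding without escape sequences.
sjis_bytes = ka.encode("shift_jis")

for label, data in [("ISO 8859-1", latin1_bytes),
                    ("UTF-8", utf8_bytes),
                    ("Shift-JIS", sjis_bytes)]:
    print(f"{label:10s} {data.hex()}")
```

  Decoding each byte string with the matching codec returns the
  original character, which is the sense in which the rules are
  "unambiguous".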

  "coding system" is a set of unambiguous rules that establishes
  a character set and the relationship between the characters of
  the set and their coded representation, by combining more
  simple coded character sets or encoding a simple coded
  character set in another way. Examples are 2022 or UTF-8; the
  2022 coding system works on one or more simple CCS as
  registered with ISO 2375. The procedure for forming characters
  in ISO 6937 is not a coding system; it is a simple coded
  character set.

  Relation between a character repertoire and its encoding.

  The encoding of a character repertoire can consist of at least
  three parts:

  1.   the simple coded character sets, for example ISO 8859-1
     combined with the control character set of ISO 6429.

  2.   The rules for combining or coding one or more simple coded
     character sets, for example 2022 or UTF-8.

  3.   A symbolic character notation, like SGML entities of the
     type &aacute;

  (maybe the transformation format like UTF-8 or UTF-16 should
  have a specific term, transformation format.)
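
  The three parts listed above can be sketched in Python as
  follows. This is only an illustration under stated assumptions:
  the standard library codecs "iso8859-1", "utf-8" and
  "iso2022_jp" stand in for a simple coded character set and two
  coding systems, and html.unescape stands in for an SGML entity
  resolver. The variable names are hypothetical.

```python
import html

a_acute = "\u00e1"  # LATIN SMALL LETTER A WITH ACUTE

# 1. A simple coded character set: one octet per character.
from_simple_ccs = bytes([0xE1]).decode("iso8859-1")

# 2. Rules for coding a character set in another way: UTF-8 ...
utf8_form = a_acute.encode("utf-8")

# ... or ISO 2022, which switches between registered sets by means
# of escape sequences (here the ISO-2022-JP profile).
hiragana_a = "\u3042"  # HIRAGANA LETTER A
iso2022_form = hiragana_a.encode("iso2022_jp")

# 3. A symbolic character notation, resolved to the same character.
from_entity = html.unescape("&aacute;")

print(from_simple_ccs == a_acute, from_entity == a_acute)
print(utf8_form.hex(), iso2022_form)
```

  The ISO-2022-JP output begins with an escape sequence that
  designates the two-octet set and ends with one that returns to
  the single-octet set, which is the state-dependent behaviour
  mentioned in the discussion of POSIX above.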

RESPONSE : Additional terms were added to the definitions, but in
general, the scope of this technical report is to address
“Character”, “Glyph”, and the transition point between those two
concepts.  It is beyond the scope of this particular technical
report to address the general framework of SC2 character
processing. It would appear that such a technical report would be
useful to SC2, and that a new project could be initiated for it
if there is sufficient support within SC2.

NTS/IT, Norway

                                       Tuesday, November 12, 1996

1.   Please notice:  In reference to document ISO/IEC JTC 1/SC 2
  N 2746 - requesting comments on the TR as mentioned above, - here
  are some comments from the Norwegian NB:

  The content of "An operational model . . " is OK as far as we
  are able to judge.

  What we eventually miss is a demarcation between the concepts
  "glyph" and "grapheme".  "Grapheme" is the concept used in
  linguistic theory in the following sense:

  Allograph

  One of a group of variants of a grapheme or written sign.  It
  usually refers to different shapes of letters and punctuation
  marks, e.g., lower case, capital, cursive, printed, strokes,
  etc., (cf. Allophone, allomorph).

  Grapheme

  A minimum distinctive unit of the writing system of a
  particular language, like the concept of "phoneme" and
  "morpheme", the grapheme has no physical identity, but is an
  abstraction based on the different shapes of written signs and
  their distribution within a given system.  These different
  variants, e.g., the cursive and printed shapes of letters M,
  m, cursivated m, M, etc. in an alphabetic writing system are
  all allographs of the grapheme /m/.

  Source:  R. R. K. Hartmann and F. C. Stork 1976:  Dictionary
  of language and linguistics.  Applied Science Publishers Ltd.,
  London.

  As can be seen, "glyph" and "grapheme" are clearly related,
  partly overlapping concepts, so to avoid confusion a note on
  the similarities and dissimilarities between the two should be
  made.  The difference is above all that the grapheme concept
  is defined in relation to writing systems of particular
  languages, whereas the glyph concept is defined with no
  relation to language whatsoever.

  *******
RESPONSE : An extract of the Norwegian comments will be added as
an informational note at the end of clause C.1 to draw attention
to the fact that this report does not address these linguistic
concepts, but it is beyond the scope of this technical report to
define the linguistic concepts or to describe the relationship
between glyphs and graphemes in greater detail.

G. Kenneth Holman

                                            Thursday, 10 Oct 1996

I just read the Character/Glyph model and I'm pleased that it
turned out so well.  I've been trying to describe this to others
in the past, and this document appears to succinctly say it all.

1.   My only comment is that there are a number of incorrect
  glyphs on the page in the examples in Annex B.
  [The following is a clarification in a Dec. 2, 1996 message.]
  The copy I received from Standards Council of Canada for
  review does not have the correct glyphs printed in section B.3
  for 2460 and 2780 (which I see now were printed twice; I
  thought that there were four problems), not because of any
  expertise on my part, but because they are rendered as empty
  rectangular boxes.

RESPONSE : The text of the technical report is correct, in the
choice and representation of the glyphs.  However, this comment
reveals a problem that needs to be addressed by the SC
Secretariats and the ITTF in distribution of softcopy standards
for review. When a standards document contains graphics, images,
and fonts that are not widely available to all recipients, the
review and balloting of the document will be subject to many
unnecessary comments and wasted time by both the reviewers and
the editors.

2. I think that the German quotes at the end of page 6 are
  reversed, showing end quote followed by start quote, where the
  other two examples are start quote followed by end quote.

RESPONSE : The German quotes were verified in the AFII glyph
register and are correctly represented.

James Agenbroad

                                         Monday, November 4, 1996

Here are a few brief comments on the August 19 version of the
'Operational Model for Characters and Glyphs'.

1.Just before section 5 it seems to suggest that combining
  characters have no glyph; isn't it more accurate to say the
  position of the glyph of a combining character depends on what
  precedes it?

RESPONSE : The paragraph will be revised.

2.In the last paragraph of section 5.1, isn't the third
  necessity a bridge between the other two?

RESPONSE : The paragraph will be revised.

3.Page 7 refers to the Devanagari ra's half or eyelash glyph but
  does not show it in figure 5.  Is this because selecting this
  glyph would be so difficult it may require a separately
  encoded character?

RESPONSE : The glyph will be added to the figure.

4.On page 8 a "j" is missing from "justify" and "justification"
  and on page 11 from "Romaji"--could this be my printer?

RESPONSE : This appears to be a similar softcopy distribution
problem to that noted above for Ken Holman’s comments.  The
submitted text does contain the correct glyphs.

5.In B.1, can't some characters, e.g., space, have both "data"
  and "control" functions?  In the next paragraph, characters
  are enumerated by assigning a unique name to each--except
  Chinese, Japanese and Korean ideography, I think.

RESPONSE : The paragraph will be revised to address the first
comment. For the second comment, the Chinese, Japanese, and
Korean (CJK) characters do have unique names (e.g., CJK
Compatibility Ideograph FAD3), they are simply not descriptive
names.
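
The difference between a descriptive name and a merely unique
name can be seen with Python's unicodedata module. This is a
sketch only: it reports the names in the Unicode database bundled
with the Python interpreter, which is assumed here to agree with
ISO/IEC 10646 for these code positions, and the code positions
are picked purely for illustration.

```python
import unicodedata

samples = ["\u0041", "\u4e2d", "\uf900"]

for ch in samples:
    # Every character has a unique name, but for CJK ideographs the
    # name is built from the code position, not from a description.
    print(f"{ord(ch):04X}  {unicodedata.name(ch)}")
```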

6.In C.3, third paragraph, "Character-to-glyph mapping tables
  are not defined by ISO standards"  why not since they're the
  necessary bridge between characters and glyphs?  It seems to
  me that without some agreement on this the two important [parts] of
  text processing have not been brought together.  This is not
  to say it will be easy to do so, it might involve different
  levels of typographic excellence, e.g., an internet note need
  not be beautiful, but some agreement on communication seems
  essential.

RESPONSE : It may be useful for such a standard to be developed,
but it is beyond the scope of this technical report.

Olle Järnefors

                                        Sunday, December 01, 1996

ISO/IEC JTC 1/SC 2 N2746
 18 August, 1996
 Title: August 1996 working draft of TR 15285, "An operational
model for characters and glyphs"

In the following, lines starting with ">" are quoted from the
draft technical report. I have replaced non-ASCII characters with
the marker "!?".

Lines starting with "/" are my suggestions for new text to
replace text in the draft.

Lines starting with "+" are my suggestions for new text to be
added to the draft.

1.Introduction

  > People recognize and process characters[1] by their shapes.
  > Thus, people normally closely associate a character[2] and
  > its shape.  Information technology, in contrast, makes
  > distinctions between the concepts of a character's[3]
  > meaning (the "character"[4]) and its shape (the "glyph").
  > The close association people make between characters[5] and
  > glyphs, and the distinction made by information
  > technology have produced a conflict that has led to
  > misunderstanding and confusion.

  The third sentence talks about a character's "meaning". But
  normally individual characters don't have a definite meaning,
  not in the way individual words of a language have meaning.

  What _is_ common for all specimens of a certain character
  then? Take the letter "d" as an example. What's common to all
  individual d's is, in my view, that they (and they only) can
  fulfill the same _function_ when writing words which are spelt
  with this letter.

  Individual characters are not "elements of meaning", I would
  say, but elements of meaningful written linguistic
  expressions, particularly words. The distinction between sign
  and meaning is fundamental to semantics. To me it's clear that
  letters belong to the sign side of this dichotomy, not the
  meaning side. And the same is true for digits. The digit "1"
  can't be identified with the meaning "the least positive
  integer". Often it means the number one, but in "123" it means
  the number 100, and in other contexts it may have no relation
  at all to numerical quantities. In my opinion these
  observations generalize to almost all graphic characters of
  ISO 10646.

  Digression about how to define the idealized concept of
  (graphic) character:

     It's true, however, that meaning plays an important role in
     the demarcation of different characters, i.e. in any
     definition of the idealized character concept. The draft
     rightly emphasizes that the abstraction from concrete marks
     on e.g. a paper to abstract _glyphs_ ideally should be based
     only on consideration of geometrical shapes. Shape is
     important also in the abstraction from concrete marks to
     _characters_, though only indirectly. Meaning is the
     important consideration, and I offer the following attempted
     definition to show how:

      A character is a mathematical set of physical marks such
     that any of them can be substituted for any other without
     changing the meaning of the text where it occurs.

      (Here I ignore complications not relevant to the scope of
     the technical report, such as the atomicity of characters
     and the dependence of some existing character distinctions
     upon the writing system used. Therefore the modest label
     "attempted definition".)

      Three things should be noted:

     1) Indirectly the _shape_ of concrete marks is important
     for which characters they are realizations of, because shape
     distinctions are essential for the ability of humans to
     discern contrasts of meaning between similar text pieces.

     2) This definition isn't my free invention. It's actually
     equivalent to standard definitions in linguistics of the
     concept of _grapheme_.

     3) It defines an idealized character concept. The
     _pragmatic_ character concept should be defined as "any
     entity coded in a coded character set standard" (who knows
     what kinds of things might have been included in some coded
     character set defined somewhere by some crazy engineers? or
     will be in the future). This definition needn't be circular.
     A "coded character set standard" can be defined as a
     standard that characterizes itself with the expression
     "coded character set standard" or an equivalent label.

  End of digression.

  To return to the text quoted above, it uses the word
  "character" in two or possibly three different senses, which
  unfortunately can add to the very confusion it describes:

     In the occurrences 1, 2, and 5 the word "character" means
     some abstraction of physical shapes used in writing.

     In occurrence 3 it means a _combination_ of a meaning
     (whatever that may be) and a physical shape (probably not an
     individual physical shape on e.g. a certain piece of paper,
     but an "abstract" shape).

     In occurrence 4 it seems to mean some kind of meaning, not a
     thing that _has_ a meaning.

  This double/triple use of one word also accounts for the
  paradoxical wording about the "character"[4] being an aspect
  of the character[3].

  A replacement for the quoted text could be something like
  this:

  / In all reading and writing of text people recognize the
  / individual physical marks read or produced on the
  / writing surface as different realizations of abstract
  / letters, ideographs, digits, symbols, and other
  / characters. The digital representation of these
  / entities is the main task of SC2 standards for coded
  / character sets. Another kind of abstract entities
  / related to the physical marks of concrete text, glyphs,
  / is central to SC18 standardization of font technology.
  / The relation between the two concepts of character and
  / glyph, which are easy to confuse, is the subject of
  / this technical report.

  The text in the draft continues:

  > The successful
  > promulgation and implementation of character coding, text
  > editing, presentation and publication standards require
  > an understanding of the appropriate use of character
  > codes and glyph identifiers.

  I don't think it's necessary to introduce the technical
  notions of "character code" and "glyph identifier" at this
  early point in the report. Furthermore, it should be mentioned
  that in certain kinds of simple data processing, the
  distinction between character and glyph isn't needed. Proposed
  new text:

  / The successful promulgation and implementation of
  / character coding, text editing, presentation and
  / publication standards require an understanding of the
  / distinction between characters and glyphs, except for
  / those simple applications where it is acceptable that
  / the same glyph is always used for the same character.

RESPONSE : The text of the Introduction will be revised to
address the concerns expressed, though the specific suggested
wording may not be used as expressed in the comments.  It is our
desire to keep the introductory text free of too much technical
detail, so readers who have little or no understanding of SC2 and
SC18 concepts can determine the relevancy of this technical
report to their needs.

2.Clause 4.  Character and glyph distinctions
  >
  > The character and glyph definitions in clause 3, which
  > were taken from ISO/IEC 10646 and ISO/IEC 9541, were
  > developed independently and contain terminology that
  > requires harmonization and explanation.

  "Harmonization" of two terminologies, as distinct from mere
  explanation, to me suggests that definitions of some terms are
  changed or new terms are introduced with new definitions. Is
  that part of the purpose of this technical report?

RESPONSE : The word “harmonization” will be deleted from the
sentence.

3.> In information technology, characters are abstract
  > information elements in the domain of coding for data
  > interchange.

  This is a statement about information technology in general,
  not restricted to coded character set standards. Therefore I
  believe it's more correct to write "... the domain of coding
  for data representation, particularly data interchange". Much
  text stored in a computer never leaves the local system, it's
  never interchanged, and still it is coded according to coded
  character set standards.

RESPONSE : The suggested phrase will be used as replacement text.

4.> Coded character set standards assign
  > numeric values, character names (descriptive text), and
  > representative (sample) images to each character
  > contained in a coded character set.

  The significance of the parenthesis in "character names
  (descriptive text)" is unclear. I think it would be better to
  leave it out and instead add the sentence:

  + Typically a character is given a multi-word name which
  + also serves as an adequate description of the
  + character, making it clear how it differs from the
  + other characters of the coded character set.

RESPONSE : “(Descriptive text)” will be removed, and revised text
will be added.

5.> The precise semantics and appearance of the information
  > elements in any given implementation are not defined by those
  > coded character set standards.

  As I explained above I don't think that characters have any
  semantics (meanings). What I think is the important thing to
  say here is that coded character set standards don't include
  explicit criteria for drawing the line between similar but
  distinct characters.

  Possible new formulation:

  / Criteria for the demarcation between nearly related
  / characters, to aid decisions about which characters to
  / choose for representing a particular text, are not
  / included in those coded character set standards, other
  / than the guidance given by the character name and one
  / concrete example of the character.

RESPONSE : Rejected; the precise semantics and appearance are not
defined by the coded character set standards.

6.> The ISO/IEC 10646
  > standard recognizes the distinction between characters
  > and their visual representation by defining the term
  > "graphic symbol".  The "graphic symbols" of SC 2
  > standards and the "glyphs" of SC 18 standards represent
  > equivalent concepts.

  Is this really true? As I read the SC2 definition of "graphic
  symbol"

  > 3.12  graphic symbol : The visual representation of a
  > graphic character or of a composite sequence. (ISO/IEC
  > 10646-1: 1993).  [See the definition of "glyph".]

  it may very well be interpreted to refer to the _concrete_
  physical mark used to represent a character on a particular
  paper or on a screen at a certain point of time. The SC18
  concept of "glyph" is an _abstract_ image, it is abstracted in
  some other way than characters are, and I have never seen any
  discussions about this alternative abstraction process in SC2
  contexts. I thus believe that the "concrete" interpretation of
  the SC2 concept of "graphic symbol" that I have formulated
  here is more plausible. In that case "graphic symbol" should
  be equated with the SC18 concept of "glyph image", not "glyph"
  (although the SC2 concept has wider applicability than the
  SC18 concept, being relevant also for hand-written text).

RESPONSE : Correct; the reference to “Glyph” will be changed to
“Glyph Image” in both this sentence and in the definition of
Graphic Symbol.

7.> The historical association of characters and glyphs has
  > resulted in character sets maintaining distinctions that
  > cannot be founded on distinctions in content, but only
  > distinctions in form; similarly, the glyph registration
  > authority and the SC 18 font resource model have made use
  > of criteria based on content to abstract potential
  > distinctions in form.

  This is a very important point. It may be obscured for many
  readers by the use of the notoriously ambiguous words "form"
  and "content". I would prefer a wording such as the following:

  / The historical association of characters and glyphs has
  / resulted in character sets maintaining distinctions
  / that cannot be motivated by the capacity of the
  / distinguished characters to cause a contrast in meaning
  / in a text. Exchanging them for each other will only
  / change the appearance of the text. Similarly, the glyph
  / registration authority and the SC 18 font resource
  / model have made use of criteria based on meaning, not
  / shape, to abstract distinctions between glyphs.

RESPONSE : Use of the words “form” and “content” will be replaced
by the words “shape” and “meaning”.

8.> For example, in ISO/IEC 10646-1, SC 2 coded the glyph FB03
  > LATIN SMALL LIGATURE FFI "!?" for round-trip integrity with
  > other standards. (See B.4 The "round-trip rule" on page 13.).

  I would prefer a simpler example than this, which involves a
  compatibility character that is equivalent with a _sequence_
  of "genuine" characters, not a single character.  The
  preceding text doesn't mention this complication. Why not use
  FF21 FULLWIDTH LATIN CAPITAL LETTER A and 0041 LATIN CAPITAL
  LETTER A as an example?

RESPONSE : We disagree; we believe the ffi makes an excellent
example.
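
The two candidate examples discussed in this comment can be
compared with a short Python sketch. It relies on the Unicode
compatibility decompositions exposed by the standard unicodedata
module, which are assumed here to reflect the compatibility
status of these characters in ISO/IEC 10646; the technical report
itself does not define such mappings.

```python
import unicodedata

ffi = "\ufb03"         # LATIN SMALL LIGATURE FFI
fullwidth_a = "\uff21"  # FULLWIDTH LATIN CAPITAL LETTER A

# The ligature is equivalent to a sequence of three characters ...
print(unicodedata.decomposition(ffi))
ffi_folded = unicodedata.normalize("NFKC", ffi)

# ... whereas the fullwidth letter is equivalent to a single one.
print(unicodedata.decomposition(fullwidth_a))
a_folded = unicodedata.normalize("NFKC", fullwidth_a)

# Canonical normalization keeps the coded ligature intact, which is
# what round-trip conversion with other standards depends on.
ffi_kept = unicodedata.normalize("NFC", ffi)
```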

9.> Also, the SC 18 Registration Authority (AFII) for ISO/IEC
  > 10036 could have registered the same glyph identifier for the
  > "!?" glyph and used it for both the 212B ANGSTROM SIGN
  > "!?" character and the 00C5 LATIN CAPITAL LETTER A WITH
  > RING ABOVE "!?" character.  However, AFII instead
  > registered two glyph identifiers.

  This is a needlessly confusing example, since it involves also
  a false distinction between _characters_ in UCS. 212B and 00C5
  are different characters only because they are included as
  such in some coded character set standard, viz. ISO 10646.
  (00C5 is a genuine character, 212B is a compatibility
  character.) It's as absurd to regard these as different
  characters as it is to say that in the sentence

     The speed of light in vacuum is exactly 299792458 m/s.

     the "metre symbol" in "m/s" is another character than the
     ordinary letter "m" in "vacuum".

  (Can anybody clarify if some earlier standard also made the
  distinction between 212B and 00C5? That would at least
  motivate their inclusion into UCS by the round-trip rule.)

  This particular false character distinction has to do with
  treating the same character as different characters depending
  on what _function_ it fulfills (letter in a word, or symbol),
  not with confusing glyph distinctions with character
  distinctions. It therefore falls outside the scope of this
  technical report, and this example should be removed, both
  here and in section E.1.

  Furthermore, this example was supposed to show problems with
  SC18 _glyph_ distinctions, not SC2 character distinctions. A
  better example is needed. I don't have access to the glyph
  registry, unfortunately, but I suspect that different glyphs
  have been registered for Latin capital A, Cyrillic capital A,
  and Greek capital alpha. If that's the case, it would provide
  an excellent example for this place in the technical report.

RESPONSE : Good point; the suggested example will be used.
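
The contrast between the two examples can be sketched in Python.
The sketch uses the Unicode normalization data in the standard
unicodedata module as a stand-in for the character equivalences
under discussion; it says nothing about what is recorded in the
AFII glyph registry, which is not consulted here.

```python
import unicodedata

angstrom = "\u212b"   # ANGSTROM SIGN
a_ring = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE

# The two code positions are treated as one and the same character
# by canonical normalization.
same_after_nfc = unicodedata.normalize("NFC", angstrom) == a_ring

# Latin A, Greek Alpha and Cyrillic A look alike in most fonts but
# remain three distinct characters under every normalization form.
lookalikes = ["\u0041", "\u0391", "\u0410"]
folded = {unicodedata.normalize("NFKC", ch) for ch in lookalikes}

for ch in lookalikes:
    print(f"{ord(ch):04X}  {unicodedata.name(ch)}")
print(same_after_nfc, len(folded))
```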

10.    > Within the realm of information technology, an ideal
  > characterization of characters and glyphs and their
  > relationship may be stated as follows:
  > ...
  > -- One or more characters may be depicted by no, one, or
  > multiple glyph representations (instances of an abstract
  > glyph) in a way that may depend on the context.
  >
  > The relationship between coded characters and glyph
  > identifiers may be one-to-one, one-to-many, many-to-one,
  > or many-to-many.  In its fully general form, it is a
  > context-sensitive M-to-N mapping where M > 0, N >= 0.

  Unfortunately, this is too simple a picture. We actually have
  two different kinds of multiplicity:

     1) Some characters can be realized by a combination of 2 or
     more glyphs, such as the 0132 LATIN CAPITAL LIGATURE IJ.

     2) Other characters can be realized by different single
     glyphs in the same font, depending on the context, such as
     the four different glyphs needed for each Arabic letter in
     row 06 of UCS, depending on the character's positions in the
     beginning, middle, or end of a word, or in isolation.

  In my opinion the text of the technical report should mention
  this complication.

RESPONSE : Rejected; the existing text covers both illustrations
mentioned, and in effect addresses other more complex
illustrations.
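
Both kinds of multiplicity raised in the comment can be made
concrete with a small Python sketch. It uses the Arabic
presentation forms and compatibility decompositions recorded in
the standard unicodedata module as a stand-in for glyphs; real
fonts identify glyphs by their own identifiers, so this is an
approximation for illustration only, and the helper name
contextual_forms is hypothetical.

```python
import unicodedata

def contextual_forms(base):
    """Map positional form name -> presentation-form code position."""
    wanted = f"{ord(base):04X}"
    forms = {}
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[1] == wanted:
            forms[parts[0].strip("<>")] = cp
    return forms

# One character, several single glyphs chosen by context.
beh_forms = contextual_forms("\u0628")  # ARABIC LETTER BEH
for name, cp in sorted(beh_forms.items()):
    print(f"{name:9s} {cp:04X}")

# One character that may be realized by a combination of two glyphs.
ij_parts = unicodedata.decomposition("\u0132").split()[1:]
print(ij_parts)
```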

11.    > (For some characters in ISO/IEC 10646-1, no glyph can be
  > defined, for example, the ZERO WIDTH NO-BREAK SPACE.)

  I would say that ZERO WIDTH NO-BREAK SPACE and all the
  characters in the range 200B - 200F, 2028 - 202E, 206A - 206F
  are not _graphic_ characters but _control_ characters: They
  don't correspond to any glyph, not even some amount of white
  space. On the other hand, they have various other useful
  effects on the organization or control of data. This is
  similar to the roles played by the control functions of ISO
  6429. And SC2 doesn't recognize any third category of
  characters, besides graphic characters and control characters.

  This makes ZERO WIDTH NO-BREAK SPACE fall outside the scope of
  the technical report, I think. It would be an improvement to
  include a paragraph early in section 4 that explains the
  difference between graphic characters and control characters
  and states that the report is only concerned with graphic
  characters, using the word "character" as an abbreviation of
  "graphic character".

RESPONSE : Rejected; the statement made in the technical report
is true, even though the character may serve some other
functions.
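
How such characters are classified in practice can be inspected
with Python's unicodedata module. Note the assumption: the
general categories shown come from the Unicode database bundled
with the interpreter, which is a later and finer classification
than the graphic/control dichotomy discussed in the comment.

```python
import unicodedata

samples = {
    "LATIN CAPITAL LETTER A": "\u0041",
    "SPACE": "\u0020",
    "CHARACTER TABULATION": "\u0009",
    "ZERO WIDTH NO-BREAK SPACE": "\ufeff",
}

# Lu = uppercase letter, Zs = space separator,
# Cc = control, Cf = format (no visible glyph of its own).
categories = {name: unicodedata.category(ch)
              for name, ch in samples.items()}

for name, cat in categories.items():
    print(f"{cat}  {name}")
```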

12.    > This is particularly true for ISO/IEC 10646
  > implementation level 3, which uses combining characters.

  This sentence is probably difficult to understand for many
  readers. It could either be removed or expanded to a full
  paragraph, describing the particular complications with font
  support for combining characters.

RESPONSE : The context of the sentence was confusing, and the
sentence was moved in response to Agenbroad’s comment 1.

13.    > Clause 5.2.  Composition, layout, and presentation
  >
  > The composition and layout process spans both processing
  > domains.  See Figure 2.

  I suppose the concepts "composition" and "layout" have well-
  defined SC18 meanings. Spontaneously I myself think of
  composition as the process of creating new text or data,
  normally performed by a human user (entering data or editing
  text). But it's clear from Figure 2 that composition as the
  word is used here is something else, needed for the output of
  text, probably a fully automatic process. Perhaps it would be
  possible to include definitions of these terms in section 3,
  or at least include a discussion in section 5.2 of their
  meanings as used here?

RESPONSE : A parenthetic expression will be added to clarify the
meaning of composition and layout.

14.    > Glyph selection is the process of selecting (possibly
  > through several iterations) the most appropriate glyph
  > identifier or combination of glyph identifiers to render
  > a coded character or composite sequence of coded
  > characters. Coded characters and their associated
  > implicit or explicit formatting information represent the
  > primary inputs to composition and layout processing, and

  The "associated formatting information" that exist together
  with coded characters is a new component of the picture, quite
  abruptly introduced here. I suppose such things as HTML tags
  are referred to by this phrase. I would like to see a short
  discussion about plain text, rich text, and "formatting
  information" somewhere before this point in the technical
  report.

RESPONSE : A parenthetic expression will be added to clarify the
meaning of associated formatting information.

15.    > The degree of glyph
  > selection intelligence and the positioning of that glyph
  > selection intelligence varies widely among existing
  > standards and implementations.

  I don't understand how an intelligence can be positioned. Has
  some piece of the original text disappeared here?

RESPONSE : The sentence will be rephrased.

16.    > Clause 6.  Glyph selection

  > -- When a 0022 QUOTATION MARK """ character is
  > encountered, a composition and layout process may have to
  > determine whether it begins or ends a quotation and then
  > choose either an opening or closing quotation mark glyph
  > as appropriate. Alternatively, the process
  > may select glyphs depending on the language of the text
  > being formatted (or the formatting style specifications
  > that apply to the content being formatted).  For example,
  > German text could substitute the "„" and "“" glyphs
  > for quotation marks; and French text, the "«" and
  > "»" glyphs.

  I don't think this is a clear-cut example of the need to use
  style information and context in the composition and layout
  process (which I assume is automatic). I doubt that any
  automatic processor, however sophisticated, can choose the
  correct form of quotation mark in all possible cases, if only
  the neutral 0022 mark is used in the character data. A better
  approach in applications where a high typographical quality is
  expected is to ban the indiscriminate use of 0022. Instead I
  think it is better in many applications that the word
  processor or other text input software guesses the correct
  quotation mark (of 2018 - 201F) based on all available
  information at input time, and immediately displays this mark
  on the screen. If the user isn't satisfied with what he/she
  sees, he/she can directly choose a better quotation mark. When the
  text is stored in a file, the 10646 quotation character
  actually chosen is included in the file.

RESPONSE : Rejected; we believe it is.
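
  As an illustration only, the input-time approach described in
  comment 16 can be sketched as below. The function name, the
  language table, and the "quote after white space opens" heuristic
  are all assumptions made for this sketch; they are not taken from
  the TR or from 10646, and a real word processor would use more
  context and let the user override the guess.

```python
# Hypothetical sketch: replace the neutral U+0022 at input time with a
# directional mark from the 2018 - 201F range. The table and the heuristic
# are assumptions for illustration, not rules from the TR.
QUOTE_PAIRS = {
    "en": ("\u201C", "\u201D"),  # LEFT / RIGHT DOUBLE QUOTATION MARK
    "de": ("\u201E", "\u201C"),  # DOUBLE LOW-9 / LEFT DOUBLE QUOTATION MARK
}


def guess_quotes(text, lang="en"):
    """Return text with each U+0022 replaced by a guessed opening or closing mark."""
    opening, closing = QUOTE_PAIRS[lang]
    out = []
    for ch in text:
        if ch == '"':
            # Treat the start of the text like preceding white space.
            prev = out[-1] if out else " "
            # Assumed heuristic: after white space or an opening bracket,
            # the mark opens a quotation; otherwise it closes one.
            out.append(opening if prev.isspace() or prev in "([{" else closing)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(guess_quotes('He said "yes".'))
    print(guess_quotes('Er sagte "ja".', lang="de"))
```

  The guessed character, not 0022, is what would then be stored in
  the file, so no further interpretation is needed at output time.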

17.    > -- When a 002D HYPHEN-MINUS "-" character is
  > encountered, a composition and layout process may have to
  > determine if it is used in a math formula, as a separator
  > between figures (digits), as a separator between words, or
  > as a separator between syllables. Depending on which
  > context applies, it will select a minus sign, a figure
  > dash, a quotation dash, or a hyphen dash (or possibly a
  > hyphen point) glyph to display the character.

  This example faces the same kind of criticism. Don't rely on
  automatic interpretation of your intentions; instead, include
  the correct character in the text when writing it. Automatic
  choice of dash/minus/hyphen form in the composition and layout
  process may be necessary in some cases, but it should not be
  held up as the only way of handling this problem in the
  report.

  Better examples of the general thesis are perhaps:
  ·    the choice of final or non-final form of "s" in Fraktur text
  ·    the choice of relative size of capital letters to the small
       letters depending on whether the text is written in German or not
  ·    the distribution of white space between the words on a
       justified line of text (where SP characters are used in the
       character coded data).

RESPONSE : Rejected; we disagree.
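
  As an illustration only, the context-dependent selection quoted in
  comment 17 can be sketched as below. The function name and the
  classification rules are assumptions made for this sketch, and the
  glyph names are simply the ones used in the quoted TR text; nothing
  here is prescribed by the TR.

```python
# Hypothetical sketch of choosing a glyph for U+002D HYPHEN-MINUS from its
# neighbours. The rules are assumed heuristics, not taken from the TR.
def select_dash_glyph(text, i):
    """Return an illustrative glyph name for the U+002D found at text[i]."""
    prev = text[i - 1] if i > 0 else ""
    nxt = text[i + 1] if i + 1 < len(text) else ""
    if prev.isdigit() and nxt.isdigit():
        return "figure dash"        # separator between figures (digits)
    if nxt.isdigit() and (prev == "" or prev.isspace() or prev in "(=+*/"):
        return "minus sign"         # looks like a sign in a formula
    if prev.isalpha() and nxt.isalpha():
        return "hyphen dash"        # separator between syllables or word parts
    if prev.isspace() and nxt.isspace():
        return "quotation dash"     # free-standing separator between words
    return "hyphen-minus"           # no confident guess; keep the neutral glyph


if __name__ == "__main__":
    for sample in ("1996-08", "x = -5", "well-known", "yes - no", "5-3"):
        print(sample, "->", select_dash_glyph(sample, sample.index("-")))
```

  Under these assumed rules "5-3" is classified as a figure dash even
  when a subtraction was meant, which is the kind of ambiguity the
  comment points at.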

18.    > In addition, Arabic topography makes extensive use of
  > ligatures.

  This should be "Arabic typography" I suppose.

RESPONSE : The spelling error will be fixed.

19.    I will not comment on the content of the annexes this
  time, other than observing that they are all labeled
  "(Informative)". But isn't the technical report as a whole
  informative? _Can_ it be normative?

RESPONSE : The “Informative” label will be removed from the annex
headings.

Takayuki Sato

                                        Monday, November 25, 1996
                                            and December 17, 1996

1.   About the scope of the TR character/glyph model, I would
  like to see a short text about character recognition
  technology.

  What I would like to see is:
  “This TR, as of its first edition, focuses on the presentation
  of characters (such as from character code to glyph ID and
  then font). It does not cover well the application of
  character recognition (where the direction is reversed: from
  picture to glyph ID and then character code).”

  If we wanted to cover “character recognition” in this TR now,
  it would delay the project. So I recommend, as with the CJK
  issue, proceeding as it is now rather than delaying the
  project.

RESPONSE : A note will be added to the scope to indicate that the
TR does not currently address character recognition technology.

1.   Annex E, E.3, 2nd paragraph, discusses composing Han.
  Remove this paragraph, and do not mention this application now.
  This is a very sensitive issue with SC2/WG2. To avoid a
  misleading discussion at WG2, it is not wise to include this
  text now. (The content of the text is not a problem, though.)

RESPONSE : We will remove the paragraph.