From: Jim Allan (email@example.com)
Date: Fri Mar 26 2004 - 16:33:58 EST
Arcane Jill posted:
> (A) A proposed character will be rejected if its glyph is identical in
> appearance to that of an extant glyph, regardless of its semantic
Unicode encodes characters, not glyphs. That particular glyphs of one
character are normally indistinguishable from particular glyphs of
another character (though perhaps in a different style) does not mean
that the characters themselves would be usefully unified.
Examples from the recent past are the deunification of Coptic from Greek
and the introduction of numerous Latin alphabet letter forms in various
styles as mathematical characters.
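
To make the point concrete, here is a minimal sketch using Python's standard `unicodedata` module. The choice of sample characters and the helper names `describe` and `folds_to_latin_a` are mine, purely for illustration. It lists several characters whose usual glyphs resemble a capital "A" yet which are encoded separately, and checks which of them compatibility normalization (NFKC) folds back to the Latin letter. The Coptic entry assumes a Unicode database recent enough to include the separate Coptic block.

```python
import unicodedata

# Characters whose usual glyphs look much like a capital "A".
# The selection is illustrative only.
LOOKALIKES = [
    "\u0041",      # Latin
    "\u0391",      # Greek
    "\u0410",      # Cyrillic
    "\u2C80",      # Coptic (needs a database with the separate Coptic block)
    "\U0001D400",  # mathematical bold
]

def describe(ch):
    """Return the code point (as U+XXXX) and the Unicode name of one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

def folds_to_latin_a(ch):
    """True if NFKC compatibility normalization maps ch to Latin 'A'."""
    return unicodedata.normalize("NFKC", ch) == "A"

for ch in LOOKALIKES:
    code, name = describe(ch)
    print(code, name, "folds to Latin A:", folds_to_latin_a(ch))
```

In this sketch only the mathematical letter (apart from Latin A itself) folds to U+0041, because it carries a compatibility decomposition; the Greek, Cyrillic and Coptic letters remain distinct characters however similar their glyphs.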
> (B) A proposed character will be rejected if its semantic meaning is
> identical to that of an extant character, regardless of the appearance
> of its glyph,
For example, that a proposed character has the approximate semantic
value of IPA _b_ doesn't mean that it should be taken as just a
variant glyph of IPA _b_ and coded as U+0062. By that rule a large
number of uncoded scripts could easily be coded by assigning their
glyphs to already encoded characters of approximately the same meaning
and using a font change to render the script.
But changing to a different script by a font change (as opposed to a
different style of the same script) is not Unicode philosophy except in
the case of cipher character sets.
> (C) A proposed character will be rejected if either (A) or (B) are true
A redundant suggestion.
However, if both (A) *and* (B) were true there would be less likelihood
that a new encoded character would be of value, especially if users are
already *happily* using a character already coded in Unicode.
That is, if the normal glyphs of a proposed new character were mostly
identical to normal glyphs of an already encoded character, and the
proposed new character also had meanings which mostly corresponded to
the meanings associated with that same already encoded character, then
it is quite likely that there would be seen to be no need to encode the
proposed new character.
But even that would not be a rule.
If, for example, in a particular script that has yet to be encoded it
chanced that the character used for the normal sound indicated by IPA
_b_ actually looked like Latin letter _b_, it would still likely be
encoded as part of that script.
The separate encoding of Coptic characters is one precedent not forced
by compatibility with previous character encodings.
By another precedent, in the case of punctuation characters and
diacritical marks similarity of form with already encoded characters
bears more weight than it does with non-punctuation characters and
> (D) None of the above
Though of course these are points that would be considered in coming to
There is a debated area here, which comes to the fore on occasion, for
example in regard to old Semitic scripts and whether particular Semitic
scripts should be lumped together or distinguished by separate encodings.
When the question of unifying or distinguishing between characters is
considered, it seems to me that the most important question is how
confusing or useful it would be to unify or distinguish between those
particular characters from the point of view of current users or
Unicode should do what is most useful.
Honest debate does arise, because what is useful in one sphere or from
one point of view may cause problems in another sphere or from another
point of view. Sometimes there is no definite correct answer.