From: Asmus Freytag (firstname.lastname@example.org)
Date: Sun Apr 03 2011 - 23:42:16 CDT
On 4/3/2011 3:32 PM, Peter Constable wrote:
> From: email@example.com [mailto:firstname.lastname@example.org] On Behalf Of Michael Everson
>>> In case not, I consider encoding of characters for hatches to be an extremely bad idea: these are not characters but graphic fills, of which there are a vast number.
>> They are merely an extension of a small set of such fills, which have
>> been standardized for centuries. It is a mistake to suggest that this is
>> an endless set. The specific fill patterns and their meanings are in fact
>> well-defined within a number of traditions, as shown in the proposal.
> The existing characters came from an age of non-graphic / character-only computer displays and are encoded purely for legacy reasons, not because in today's world of graphical user interfaces it's a good idea to encode them as graphic characters. In WG2, we recently saw a preliminary proposal from China in which they proposed some line-drawing characters, and it was explained to them why the existing characters are encoded and why adding new characters was not a good idea, and they understood that explanation and agreed....
It's clear from this exchange that not all participants agree on the
status of the existing characters and what that means for using them as
a precedent for additional encoding.
This therefore requires a somewhat longer reply, so bear with me.
To understand this argument, it really helps to be familiar with
character display before the advent of graphical user interfaces. And of
course, having seen and worked with the original implementations of
these Asian legacy sets doesn't hurt either.
There are literally hundreds of characters encoded in Unicode which
exist solely for the purpose of allowing a Unicode-based implementation
to 'pretend' to be a legacy implementation by mapping oddball legacy
character codes to unique Unicode values.
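The 'pretending' works because the mapping is a bijection: each legacy code maps to exactly one Unicode value, so converting to Unicode and back loses nothing. A minimal sketch in Python, with purely illustrative legacy code values (not an actual mapping table from any standard):

```python
# Sketch of a lossless round-trip mapping between a legacy character
# set and Unicode compatibility characters. The legacy byte values
# below are hypothetical placeholders, for illustration only.
LEGACY_TO_UNICODE = {
    0xA1A1: 0x3000,  # hypothetical legacy code -> a space character
    0xA2A5: 0x2592,  # hypothetical legacy code -> a shade/fill character
    0xA2A6: 0x2593,  # hypothetical legacy code -> another fill character
}

# Inverse table; valid only because the forward mapping is one-to-one.
UNICODE_TO_LEGACY = {u: l for l, u in LEGACY_TO_UNICODE.items()}

def to_unicode(legacy_code: int) -> int:
    """Map a legacy character code to its unique Unicode code point."""
    return LEGACY_TO_UNICODE[legacy_code]

def to_legacy(code_point: int) -> int:
    """Map a Unicode code point back to the original legacy code."""
    return UNICODE_TO_LEGACY[code_point]

# Round trip is lossless: every legacy code survives the conversion
# to Unicode and back unchanged.
assert all(to_legacy(to_unicode(c)) == c for c in LEGACY_TO_UNICODE)
```

This uniqueness requirement is exactly why some otherwise-duplicate compatibility characters had to be given their own code points: without a distinct Unicode value per legacy code, the round trip could not be guaranteed.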
When the transition was first made from using the legacy character sets
themselves to their "virtual" representation using Unicode and lossless
character code mapping, the presence of these compatibility characters
was essential to guarantee that round trip.
Now that technology has moved on, it is doubtful that these characters
are even used any longer, because the legacy implementations downstream
with their character mode displays were probably long replaced by
graphical UIs implemented directly in Unicode.
This does not mean that such characters should be deprecated - it's
simply unknown whether they are being used somewhere or whether legacy
data exists that contains them. However, it does mean that their
presence is no longer a "productive" precedent.
What is a non-productive precedent?
A non-productive precedent is one where, for historical and other
reasons, some characters, while still present in Unicode, are considered
"mistakes" or "exceptions", and therefore there is broad support for
the idea of not adding more characters like them.
So why isn't there a list of characters that are not precedents?
There are several reasons. Many characters were considered to be encoded
for "compatibility" when Unicode was created (or when they were added
later to Unicode). Over time, some of these characters were then found
to have very active use in certain environments, or to be
indistinguishable from characters that would otherwise have had to be
encoded anew to cover certain needs.
Also, in a process of discovery over the two decades of Unicode's
existence, the exact boundaries implied by the character-glyph model
have had to be adjusted (in a limited number of cases) to better match
the rather messy real world of character usage.
Essentially, this boils down to the fact that whether a character is a
suitable precedent for adding other, similar, characters is not a
question of black and white, but can take on any value from "definitely
not" to "very definitely".
In this case, the situation is that the KS X 1001 characters are
"definitely not" precedents. If it can be shown that these characters
have found other uses that are unrelated to their identity as KS X 1001
compatibility characters, then that would affect their value as precedents.
For that to be shown, evidence needs to exist of documents /
implementations that use these as Unicode characters (not merely
similar-looking graphic elements in text).
The proposal does not give such evidence; therefore, for these
characters, the default reply to proposals to extend the set
remains: do not extend it, because these characters are "definitely
not" considered precedents (given their particular encoding history).
The statement "these characters are a modest extension to an existing
set" is therefore misleading - the existing characters are of such
different origin and different use that they cannot be considered a
partial encoding of the proposed characters.
Secondly, the proposal doesn't give evidence that any of the characters
are used in plain-text today or that such usage is urgently needed.
(Aside: I think Unicode and WG2 should formally recognize that "urgently
needed" has been a valid reason for encoding characters for a while now,
and not only for currency symbols - the claim that only "existing use"
is a valid rationale is not backed by the facts).
If evidence of use as *characters*, or of an "urgent need" on the part
of a significant user community, could be established, then that would be
sufficient to consider the entire set of characters on their merits. The
unifications with existing characters would then be made as appropriate.
This archive was generated by hypermail 2.1.5 : Sun Apr 03 2011 - 23:47:42 CDT