From: Asmus Freytag (firstname.lastname@example.org)
Date: Mon Sep 07 2009 - 18:59:10 CDT
On 9/7/2009 5:50 AM, verdy_p wrote:
>> From: "Shriramana Sharma"
>> To: "email@example.com"
>> Cc:
>> Subject: Request clarification on disunification based on different character properties
>> Hello. Again the disunification question. P 29 of the P&P document:
>> If a character disunification cannot be achieved by adding one
>> new character without requiring a change in very significant properties
>> of the existing character and without changing the representative glyph
>> or range of expected glyphs for the existing character, then new
>> characters will be added for each of the distinct, specific letterforms required.
To that end Philippe proposes:
> "If a character unification cannot be maintained without changing very significant properties of the existing
> character and without changing the representative glyph or range of expected glyphs for the existing character, then
> new characters will be added for each of the distinct, specific letterforms required."
Which is an entirely different statement.
First, the change in context from "dis-"unification (an event, triggered
by a proposal) to "maintaining unification" (a state).
In the P&P, the context is always that of a proposal that has been
submitted that would ask for some change in encoding. In this case, it
would ask for a new character to cover some textual entity that was,
heretofore, encoded with an existing character. The standard example of
that situation is the character "HYPHEN-MINUS" which has been (and is)
used for both hyphen and minus sign (as well as some other dash-like
entities which we'll ignore here).
The principle states what to do when someone comes and asks for a
specific character that only means "MINUS" and looks a bit different.
If this request is found acceptable, then, the principle states, it's
not enough to just code a "MINUS", but it's also necessary to code a
"HYPHEN".
The rationale for that is that by adding both new characters, the
existing character can be used (as before) in an ambiguous manner. If,
instead, only a "MINUS" was added, then users that wanted to contrast
minus sign and hyphen would need to use the ambiguous character as if it
was exclusively a hyphen. That would change the nature (read very
significant properties) of that character in a way that the P&P finds
objectionable.
If the proposal had instead been to add just a HYPHEN, then there would
have been pressure to treat the formerly ambiguous character as a MINUS,
including perhaps, to have common implementations change its glyph over
time. So that's equally objectionable.
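To see how this example looks in the character database, here is a minimal sketch in Python. It assumes the standard unicodedata module, and it assumes that the three code points listed are the ones this example refers to (U+002D for the ambiguous character, U+2010 and U+2212 for the specific ones). The names CANDIDATES and describe are purely illustrative and not taken from the P&P or the standard.

```python
import unicodedata

# Assumed mapping for the example above: the ambiguous legacy character
# followed by the two specific characters.
CANDIDATES = ["\u002D", "\u2010", "\u2212"]

def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# Collect the properties once so they can be printed or inspected.
rows = [describe(ch) for ch in CANDIDATES]
for code_point, name, category in rows:
    print(code_point, name, category)
```

The ambiguous character still carries its original name and its dash-punctuation category, while the specific minus is classed as a math symbol. That is the kind of property difference the principle is meant to keep from being forced onto the existing character.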
The P&P states, in essence, that accepting a proposal may not result
in changing the *identity* of any existing character - something that's
prohibited by the character encoding stability policy. Disunification is
allowed if it can be carried out (for example by encoding several new
characters) in a way that preserves the (essential) interpretation of
the existing character.
Here, now we come to a new wrinkle. Sometimes character X is used only
occasionally for purpose Y, usually because the real character isn't
encoded and people make do. In that case, you can argue, the (essential)
identity of X doesn't actually contain any aspect of Y, or so little as
to not be predominant. In that case, just coding Y is acceptable.
Making a judgment on whether this is a case that requires both new
characters or only one is not something that proceeds from algorithmic
application of rules. That's why the term "significant" is not further
defined, beyond its ordinary use in the English language.
One final observation. By rejecting any and all character proposals,
it's always possible to maintain, indefinitely, any existing "character
unification", for whatever reason. Therefore, the P&P does not talk
about maintaining unifications, but how to deal with requests that would
result in disunifications.
We've now seen why suggestions for revision of the P&P text are best
approached very cautiously, and only after a clear understanding of the
impact of such changes. The document distills over a decade of
experience of major participants in the character encoding effort, and
it is primarily written for an audience of experts (in other words,
delegates to WG2) to help ground their decisions in well-understood
precedents. It's not a cookbook for deciding character encoding
questions by rote.
Proposals should focus on making a case for a particular encoding change
on their own merits, not by arguing chapter and verse from the P&P.