Date: Fri Jun 27 2003 - 05:22:28 EDT
Kenneth Whistler wrote on 06/26/2003 05:36:34 PM:
> Why is making use of the existing behavior of existing characters
> a "groanable kludge", if it has the desired effect and makes
> the required distinctions in text?
Why is it a kludge to insert some cc=0 control character into the text for
the sole purpose of preventing reordering during canonical ordering of two
combining marks that do interact typographically and so should, but
nevertheless do not, have the same combining class; and, moreover, to do so
using a control character that was not created for that purpose?
The answer seems so obvious, I wouldn't know how to begin responding.
And the fact that it achieves some desired effect has no bearing on being
described as a kludge -- every kludge achieves some desired effect. If it
were otherwise, the given practice would never have been conceived.
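
To make the mechanics concrete, here is a minimal Python sketch of the
reordering and of the cc=0 blocking trick under discussion. The choice of
LAMED + PATAH + HIRIQ as the sample sequence and the helper names are
assumptions made for illustration only; the sketch relies on the standard
unicodedata module and on RLM as the inserted control, as suggested in the
quoted message.

```python
import unicodedata

# Sample characters (an illustrative choice, not taken from the proposal).
LAMED = "\u05DC"   # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"   # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, combining class 14
RLM = "\u200F"     # RIGHT-TO-LEFT MARK, combining class 0


def combining_classes(text):
    """Return the canonical combining class of each character in text."""
    return [unicodedata.combining(ch) for ch in text]


def nfd(text):
    """Apply canonical decomposition, which includes canonical ordering."""
    return unicodedata.normalize("NFD", text)


# The order the text needs: patah first, then hiriq.
intended = LAMED + PATAH + HIRIQ
# The same sequence with a cc=0 character inserted between the two points.
blocked = LAMED + PATAH + RLM + HIRIQ

# Canonical ordering sorts adjacent marks by ascending combining class,
# so hiriq (14) is moved in front of patah (17).
print(nfd(intended) == LAMED + HIRIQ + PATAH)   # True

# A cc=0 character ends the run of marks, so nothing is reordered.
print(nfd(blocked) == blocked)                  # True
print(combining_classes(blocked))               # [0, 17, 0, 14]
```

The second result is the "desired effect": the typed order survives
normalization, but only because an unrelated control character now sits
between the two points.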
> But in the 10646 WG2 context, coming in with a duplicate set
> of Hebrew points is not going to make any sense...
> You can always come in
> with the proposal to encode BIBLICAL HEBREW POINT PATAH and
> say, even though the glyph is identical, see, the name is
> different, so the character is different. But this is a pretty
> thin disguise, and is vulnerable to simple questioning:
> What is it for?
Are we saying that ISO doesn't give a rip for implementation issues? Or
that their notion of ordering distinctions is different from Unicode's
such that *any* differently ordering permutation of some given set of
characters is considered a distinct representation? Are we saying that the
voting members of WG2 are not already aware of the issue that has been
discussed, and are incapable of understanding an explanation of these
issues addressed to them?
> I'm trying to find a way, using existing characters and a
> simple set of text representational conventions, to make
> the distinctions and preserve the order relations that you
> need for decent font lookup, without the whole enterprise
> washing up on either of those two rocks.
Understood. I wasn't expecting the surf to go off in this direction since
I was under the impression when we discussed this back in December on
unicoRe that there was a consensus that we should pursue just exactly what
I wrote in the proposal.
If we want to insert a control character to prevent reordering under
canonical ordering, I think it would be preferable to create a new control
character for just that purpose: that would give a character that could be
used elsewhere for the very same purpose without needing to worry about
what unanticipated and undesirable effects might result by hijacking a
control created for some completely unrelated purpose. For instance, you
suggested RLM. Suppose next week we discover a very similar issue in a LTR
script; do we want to insert RLM to prevent mark reordering in that case?
No! Do we want to be telling people to pick and choose from various
controls, using different ones according to the directionality of the
text? What if the base character is a neutral, or has selectable
directionality (I'm thinking ahead to Tifinagh, which is written either
LTR or RTL)? Are we also going to introduce the use of PDF for this
purpose in some contexts? How complicated do we want to make this? (Every
time we conflate distinct functions on a single control character, we are
inviting added complication, and are setting ourselves up for regrets. One
might think that lesson was learned from the conflation of ZWNBSP and BOM.)
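
As a rough illustration of the concern about borrowed controls, the sketch
below looks up two properties of the controls named above using Python's
standard unicodedata module. The dictionary and function names are
hypothetical, chosen only for this example. Both characters have combining
class 0, which is what makes them usable as blockers, but each also carries
a bidirectional class of its own that comes along with it.

```python
import unicodedata

# Controls mentioned in the discussion as possible reordering blockers.
CANDIDATES = {
    "RLM": "\u200F",   # RIGHT-TO-LEFT MARK
    "PDF": "\u202C",   # POP DIRECTIONAL FORMATTING
}


def blocker_properties(ch):
    """Return (canonical combining class, bidirectional class) for ch."""
    return (unicodedata.combining(ch), unicodedata.bidirectional(ch))


for name, ch in CANDIDATES.items():
    ccc, bidi = blocker_properties(ch)
    print(f"{name}: ccc={ccc}, bidi class={bidi}")
```

RLM reports the strong right-to-left class "R", so placing it inside a
left-to-right run is not neutral with respect to bidirectional layout.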
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485