From: John Cowan (cowan@mercury.ccil.org)
Date: Wed Aug 06 2003 - 22:05:49 EDT
Kenneth Whistler scripsit:
> Is a right-to-left script encoded in visual order in
> the backing store or in phonetic (= logical) order?
I've always thought this term "visual order" was productive of
nothing but confusion. I realize that there's precedent in the
8859-x RFCs for its use, but what it really means is:
Is a right-to-left script encoded from left to right in
the backing store or from right to left?
So put, the question answers itself.
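A minimal Python sketch of one reading of this point. The Hebrew sample word is an assumption for illustration and does not come from the thread. In logical order the backing store holds the first-read letter at index 0; a "visual order" store for a left-to-right display is the same letters reversed.

```python
import unicodedata

# Sample word (an assumption for illustration): shin, lamed, vav, final mem,
# written with escapes so the source text has no display-order ambiguity.
logical = "\u05e9\u05dc\u05d5\u05dd"

# Logical order: index 0 is the letter a reader encounters first.
first_read = unicodedata.name(logical[0])

# Each letter carries the strong right-to-left bidi class "R"; direction is
# applied at display time and is not stored in the sequence itself.
bidi_classes = [unicodedata.bidirectional(ch) for ch in logical]

# A "visual order" store for a left-to-right display is simply the reverse.
visual = logical[::-1]
first_stored_visual = unicodedata.name(visual[0])

print(first_read, bidi_classes, first_stored_visual)
```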
> This should, IMO, be put up on a pedestal and have the spotlights
> shined on it. This is the *fundamental* obligation of a character
> encoding standard. If you cannot accomplish this, then you just
> have a bunch of charts full of pretty pictures, and everyone is
> on their own for trying to figure out how to communicate with
> anybody else using them.
And AFAIK no other character encoding standard satisfies that obligation.
The rest of them are all charts with pictures, names, and numbers.
> The reason why the UTC should tackle the encoding of Tengwar
> is not so much because it would help in the publication of Elvish
> poetry, but because confronting the architectural issues
> it poses for encoding would make an excellent tutorial case
> for how the two principles of combining mark order and
> logical order impact the task of coming up with an appropriate
> encoding for a complex script.
Indeed.
> And it would starkly illustrate
> the fact that an appropriate character encoding does not
> necessarily directly reflect the phonological structure of
> a language as represented by that script.
"Not necessarily" is the operative word. The question is whether that
failure to reflect is tolerable. At present, three possibilities have
been kicked about:
1) Encode the vowel signs as combining characters, and therefore after
the base characters over/under which they appear, logical order be
damned.
2) Encode the vowel signs as base characters explicitly ligatured with
ZWJ to the characters over/under which they appear, in logical order.
3) Encode the vowel signs as base characters implicitly ligatured by
the font ligaturing table to the base characters over/under which they
appear, in logical order. This alternative requires the use of
distinct ligaturing tables (perhaps distinct fonts) for different
modes of use.
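The three possibilities can be sketched as stored sequences. This is a rough Python illustration only: Tengwar has no assigned code points, so the names `<TINCO>` and `<A-SIGN>` are hypothetical placeholders, and the sketch assumes a mode of use in which the vowel sign is read before the consonant it is drawn over. Only ZWJ (U+200D) is a real character here.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER, a real Unicode character

# Hypothetical placeholders; no Tengwar code points are assumed to exist.
CONSONANT = "<TINCO>"
VOWEL = "<A-SIGN>"

def option1_combining(pairs):
    """Vowel sign as a combining mark: stored after its base, whatever the reading order."""
    return [tok for vowel, base in pairs for tok in (base, vowel)]

def option2_explicit_zwj(pairs):
    """Vowel sign as a base character, in reading order, joined to its carrier with ZWJ."""
    return [tok for vowel, base in pairs for tok in (vowel, ZWJ, base)]

def option3_font_ligature(pairs):
    """Vowel sign as a base character, in reading order; the font's ligature table does the stacking."""
    return [tok for vowel, base in pairs for tok in (vowel, base)]

# One syllable in which the vowel is read first but drawn over the consonant.
syllable = [(VOWEL, CONSONANT)]

stored1 = option1_combining(syllable)
stored2 = option2_explicit_zwj(syllable)
stored3 = option3_font_ligature(syllable)
print(stored1, stored2, stored3)
```

Under these assumptions only the first option stores the syllable out of reading order. The third stores nothing beyond the letters themselves, which is why the choice of stacking behaviour has to come from the font or ligature table selected for the mode.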
-- 
John Cowan  jcowan@reutershealth.com
http://www.reutershealth.com  http://www.ccil.org/~cowan
You are a child of the universe no less than the trees and all other
acyclic graphs; you have a right to be here.
        --DeXiderata by Sean McGrath