I have struggled for some time with the question of whether Khmer should
have an escape code for subjoined consonants. One of the arguments for
such an escape code is that unusual and unexpected characters do, on rare
occasions, turn up under the base consonant (even independent vowels, for
example). This opens Pandora's box in terms of letting anything get down
there.

Obviously, using an escape code reduces the number of code points in the
standard (but isn't that letting the tail wag the dog?).

On the other hand, in Cambodia there is no concept of a virama transforming
a consonant into its subjoined form. It is hence an artificial construct,
not inherent in the script. Furthermore, I feel it complicates sorting and
searching algorithms. The weight of a subscript/subjoined consonant is
much less than the weight of a base consonant for sorting purposes.
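
To make the sorting concern concrete, here is a rough sketch of what a sorter has to do under an escape-code model. It is only an illustration under stated assumptions: the escape code is taken to be U+17D2 (the code point Unicode later assigned to KHMER SIGN COENG), the names split_clusters and sort_key are hypothetical, and raw code point values stand in for real collation weights, which a proper Khmer ordering would replace.

```python
# Assumption: the escape code is U+17D2 (KHMER SIGN COENG) and it always
# precedes the consonant it subjoins. Not a real Khmer collation table.
COENG = "\u17D2"


def split_clusters(text):
    """Return (char, is_subjoined) pairs, folding each escape code into
    the character that follows it."""
    units = []
    i = 0
    while i < len(text):
        if text[i] == COENG and i + 1 < len(text):
            units.append((text[i + 1], True))
            i += 2
        else:
            units.append((text[i], False))
            i += 1
    return units


def sort_key(text):
    """Two-level key: base characters decide first; subjoined consonants
    only break ties, so they carry much less weight."""
    primary = []
    secondary = []
    for ch, subjoined in split_clusters(text):
        if subjoined:
            # Remember which base position the subscript hangs under.
            secondary.append((len(primary), ord(ch)))
        else:
            primary.append(ord(ch))
    return (tuple(primary), tuple(secondary))


ka_kha = "\u1780\u1781"            # KA, KHA
kra_ka = "\u1780\u17D2\u179A\u1780"  # KA + subjoined RO, KA

naive = sorted([ka_kha, kra_ka])
weighted = sorted([ka_kha, kra_ka], key=sort_key)
```

A plain code point sort compares the escape code itself against KHA and puts the KA, KHA string first, whereas the two-level key compares the base letters (KA, KA against KA, KHA) and reverses that order. The extra pass to pair each escape code with its consonant is the added complication.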
On Thu, 16 Jan 1997 unicode@Unicode.ORG wrote:
> At 12:48 AM 1/16/97 -0800, unicode@Unicode.ORG wrote:
> > - how does UCS Tibetan avoid the alternative spellings problem outlined
> > above?
> It normalizes data (according to its definition of normal) at some well-
> defined boundary.
> > - would a UCS Tibetan implementation be "wrong" if it allowed just base
> > characters to be input?
> There is very little that could be construed as 'wrong' in regard to
> the conformance clause of Unicode and even less so for 10646. Whether
> it would be 'wrong' in terms of giving users what they want, only the
> market can decide.
> Glenn Adams