Re: Query for Validity of Thai Sequence

From: John Hudson (john@tiro.ca)
Date: Fri Feb 09 2007 - 00:09:29 CST

    Lokesh Joshi wrote:

    > Thanks... When I went through the Thai language rules, the state diagram
    > showed that this sequence may be illegal. Moreover, ICU treats the
    > sequence as illegal and shows a circle below the THANTHAKHAT. :)

    > I have just ordered Vista and will try it on that today.

    I tested this sequence with the version of the Uniscribe Thai engine that ships with
    Vista, and it works fine, i.e. no dotted circles.* It is important to make a distinction
    between what may be a grammatically or phonologically invalid sequence in a particular
    *language* and what may in fact be a perfectly valid combination of characters according
    to the general rules of the script. I would say that in the general rules of the Thai
    script the sequence -- a letter plus a vowel sign plus a secondary sign -- is valid even
    if it is linguistically nonsensical in the Thai or Pali languages (one of the functions of
    thanthakhat is to mark final consonants in Pali).

    Successful writing systems tend to get adapted for multiple languages, and it is dangerous
    to make assumptions about what is valid for a script based on how it is used for certain
    languages. My own view is that the only restrictions on validity of character sequences
    should be technical ones, and every attempt should be made to keep these to a minimum.
    Ideally, any combination of any marks should be applicable to any base character in any
    script, and it should be up to the font to try to figure out a sensible way to display
    such sequences. I'm wary of shaping engines trying to perform what are, in effect,
    spellchecking functions. If a Thai spellchecker wants to tell me that <0E25, 0E37, 0E4C>
    is invalid, that's fine, but if a Thai script engine won't let me display it, then I think
    that is a flaw in the engine.
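
To make the distinction concrete, here is a minimal Python sketch that looks only at the script-level structure of <0E25, 0E37, 0E4C>, using the standard unicodedata module. The helper names describe and is_base_plus_marks are hypothetical, and the "one letter followed by nonspacing marks" test is a deliberately permissive assumption for illustration; it is not the rule that Uniscribe, ICU, or any other engine actually applies.

```python
import unicodedata

# The sequence under discussion: <0E25, 0E37, 0E4C>.
SEQUENCE = "\u0E25\u0E37\u0E4C"


def describe(text):
    """Return (code point, character name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


def is_base_plus_marks(text):
    """Hypothetical, permissive script-level check (an assumption for this sketch).

    Accepts one letter followed by zero or more nonspacing marks.
    It knows nothing about Thai or Pali spelling rules.
    """
    if not text:
        return False
    base, marks = text[0], text[1:]
    return unicodedata.category(base).startswith("L") and all(
        unicodedata.category(mark) == "Mn" for mark in marks
    )


if __name__ == "__main__":
    for row in describe(SEQUENCE):
        print(*row)
    print("base + marks:", is_base_plus_marks(SEQUENCE))
```

Under this check the sequence passes as a letter carrying two marks. Whether it is a real word is a separate question for a spellchecker, which would need language-specific data that this sketch intentionally leaves out.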

    John Hudson

    * This didn't surprise me, since Peter Constable at MS is particularly knowledgeable about
    the Thai script and sensitive to minority language issues.

    -- 
    Tiro Typeworks        www.tiro.com
    Vancouver, BC         john@tiro.ca
    Marie Antoinette was a woman whose core values were chocolate,
    sex, love, nature and Japanese ceramics. Frankly, there are
    worse principles of government than that.  - Karen Burshtein
    

