Re: Query for Validity of Thai Sequence

From: Lokesh Joshi (lokeshjoshi@gmail.com)
Date: Fri Feb 16 2007 - 13:28:34 CST

    #5589: Thai layout broken for <0E25, 0E37, 0E4C>?
    -------------------------------+--------------------------------------------
     Reporter: markus | Owner: eric
         Type: defect | Status: assigned
     Priority: major | Milestone: UNSCH
     Component: layout | Version:
    Resolution: | Keywords:
         Xref: 2382, 2386, 4740 | Java:
           Os: | Project: ICU4C
        Weeks: 1 | Revw:
    -------------------------------+--------------------------------------------
    Changes (by eric):

     * status: new => assigned
     * weeks: => 1
     * xref: => 2382, 2386, 4740

    Comment:

     Suwit Srivilairith, my contact at IBM Thailand, confirms that the sequence
     is illegal according to WTT 2.0, which is a Thai national standard. Other
     posters on the Unicode list thread point out, however, that the strict
     checking of WTT 2.0 makes it impossible to write Pali and Sanskrit using
     the Thai script. Another poster suggested that the strict checking of WTT
     2.0 should more correctly be done in a spell checker rather than in the
     rendering engine, which makes sense to me.

     Here's the Microsoft Thai OT spec. section about invalid combining marks:
     http://www.microsoft.com/typography/otfntdev/thaiot/shaping.aspx

     Therefore, I'm inclined to think that more relaxed checking is in order.
     This, unfortunately, will require a redesign of the Thai engine. Perhaps
     this should be done in conjunction w/ the work for Thai and Lao OpenType
     processing, and perhaps the generic mark checking. (Tickets 2382, 2386,
     4740)
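
     To make the split between "relaxed checking in the rendering engine" and
     "strict checking in a spell checker" concrete, here is a minimal sketch.
     It is not the ICU layout code and it does not reproduce the WTT 2.0
     tables: the names relaxed_ok, strict_warnings and
     STRICT_FORBIDDEN_PAIRS are hypothetical, and the only strict rule
     encoded is the one pair reported in this ticket.

```python
import unicodedata

# The sequence from ticket #5589: LO LING, SARA UEE, THANTHAKHAT.
SEQUENCE = "\u0e25\u0e37\u0e4c"

# Hypothetical stand-in for a strict table. It holds only the one pair that
# this ticket reports as illegal under WTT 2.0, not the real WTT 2.0 rules.
STRICT_FORBIDDEN_PAIRS = {("\u0e37", "\u0e4c")}


def is_mark(ch):
    """True for nonspacing combining marks (general category Mn)."""
    return unicodedata.category(ch) == "Mn"


def relaxed_ok(text):
    """Renderer-style check: a mark needs some non-mark character before it."""
    have_base = False
    for ch in text:
        if is_mark(ch):
            if not have_base:
                return False
        else:
            have_base = True
    return True


def strict_warnings(text):
    """Spell-checker-style pass: list adjacent pairs the strict table rejects.

    Each entry is (index of the first character, first, second).
    """
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(text, text[1:]))
        if (a, b) in STRICT_FORBIDDEN_PAIRS
    ]


if __name__ == "__main__":
    for ch in SEQUENCE:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    print("relaxed check passes:", relaxed_ok(SEQUENCE))
    print("strict warnings:", strict_warnings(SEQUENCE))
```

     With this split, relaxed_ok(SEQUENCE) accepts the sequence so it can
     still be laid out, while strict_warnings(SEQUENCE) flags the
     SARA UEE + THANTHAKHAT pair at index 1 for a separate checking pass.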

    On 2/16/07, Philippe Verdy <verdy_p@wanadoo.fr> wrote:
    >
    > On 2/16/07, Richard Wordingham <richard.wordingham@ntlworld.com> wrote:
    > > Philippe Verdy wrote on Thursday, February 15, 2007 10:28 PM
    > > > Regarding the question of the validity of Thai sequences, the
    > > > following specification of the Thai support in OpenType (here the
    > > > HTML version available on Microsoft Typography website) is worth
    > > > noting:
    > > > http://www.microsoft.com/typography/otfntdev/thaiot/shaping.aspx
    > > (...)
    > > Peter Constable of Microsoft is well aware of these problems and was
    > > endeavouring to ensure there would be no such problems in Windows Vista.
    > >
    > > I don't know whether Microsoft have dealt with the problem the old rules
    > > imposed for Pali and Sanskrit in Lao. Of course, Pali and Sanskrit need
    > > the missing consonants to be restored for Lao. I don't know how standard
    > > the improper use of the unassigned code points in the Lao block is - I
    > > have had some surprises looking at the Lao fonts that provide the
    > > unencoded consonants. I had expected the encoding to be basically
    > > Thai + 0x80, though that can't work for Indic NYA and YA.
    >
    > Please remember that OpenType is not just used and defined by Microsoft;
    > there are also Apple, Adobe, Monotype, HP, and other well-known font
    > designers or rendering engine providers. This concerns not only Windows
    > but all OSes, and also printer manufacturers (because they may use their
    > own shaping engine and their own page description language accepting
    > OpenType fonts).
    >
    >
    > (I'm curious about Canon, Brother and Epson, because they have their own
    > rewritten version of the PostScript engine in some of their entry
    > models, to avoid paying the licence for the Adobe engine. Some cheaper
    > printers use GDI interfaces to avoid including such a complex engine in
    > the printer, but this relies on Windows for the complexity of rendering,
    > and those printers are not compatible with other OSes, and sometimes not
    > even with several versions of Windows, due to differences in the GDI
    > implementation of Windows and the lack of driver support for converting
    > GDI primitives that are absent in the printer. I don't like this
    > strategy for cheap printers, and I much prefer that manufacturers adopt
    > their own description language based on common industry standards rather
    > than basing their design on a single OS.)
    >
    >

    -- 
    Contact me AT g/y/h   {The harder I work, the luckier I get ! (Failure is
    never an option)}
    

