Automatic transliteration between scripts (was: Re: ISO 10646 compliance and EU law)

From: Christopher Fynn (cfynn@gmx.net)
Date: Thu Jan 06 2005 - 08:40:12 CST

    Hi Philipp

    It is *theoretically* possible to build a font which lets you type in
    Wylie transliteration and has lookups in the font to display Tibetan,
    though you would probably need a very large lookup table. The main
    difficulty is that OpenType shaping engines like Uniscribe apply
    particular subsets of OpenType features to particular ranges of
    characters on a script-by-script basis, so you would have to use only
    those features which the layout engine applies to Latin script in order
    to get it to display Tibetan text. Even if that worked you might not get
    correct line breaking, word selection and other behaviour in your
    displayed Tibetan text.
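
    To give a rough idea of what such a lookup table amounts to, here is a
    minimal sketch in Python rather than in actual OpenType lookups. The
    table WYLIE_TO_TIBETAN and the function wylie_to_tibetan are hypothetical
    names made up for illustration, and the table covers only a handful of
    letters. It does a greedy longest-match substitution, roughly the way a
    ligature lookup prefers the longest matching sequence. It ignores
    consonant stacks, prefixes and everything else a real Wylie converter
    has to handle, which is where the table would become very large.

```python
# Hypothetical, deliberately tiny table: a few Wylie sequences mapped to
# Tibetan Unicode characters. This is an assumption-laden sketch, not a
# complete or standard Wylie implementation.
WYLIE_TO_TIBETAN = {
    "kh": "\u0F41",  # TIBETAN LETTER KHA
    "ng": "\u0F44",  # TIBETAN LETTER NGA
    "k": "\u0F40",   # TIBETAN LETTER KA
    "g": "\u0F42",   # TIBETAN LETTER GA
    "d": "\u0F51",   # TIBETAN LETTER DA
    "n": "\u0F53",   # TIBETAN LETTER NA
    "b": "\u0F56",   # TIBETAN LETTER BA
    "m": "\u0F58",   # TIBETAN LETTER MA
    "i": "\u0F72",   # TIBETAN VOWEL SIGN I
    "u": "\u0F74",   # TIBETAN VOWEL SIGN U
    "e": "\u0F7A",   # TIBETAN VOWEL SIGN E
    "o": "\u0F7C",   # TIBETAN VOWEL SIGN O
    "a": "",         # inherent vowel, not written
    " ": "\u0F0B",   # tsheg (syllable delimiter)
}

# Try longer keys first so that "ng" wins over "n" followed by "g".
_KEYS_LONGEST_FIRST = sorted(WYLIE_TO_TIBETAN, key=len, reverse=True)


def wylie_to_tibetan(text):
    """Greedy longest-match substitution over the small table above."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS_LONGEST_FIRST:
            if text.startswith(key, i):
                out.append(WYLIE_TO_TIBETAN[key])
                i += len(key)
                break
        else:
            # Anything outside the table is rejected rather than guessed at.
            raise ValueError(f"no mapping for {text[i]!r} at position {i}")
    return "".join(out)


if __name__ == "__main__":
    print(wylie_to_tibetan("bod"))
    print(wylie_to_tibetan("ming"))
```

    Even this toy version has to order its entries so that "ng" is tried
    before "n"; a font doing the same job would also need entries for every
    stacked combination.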

    It is much easier to get such a conversion to work in VOLT itself than
    it would be to get it working in applications, as VOLT will let you apply
    any OT lookups to glyphs for characters in any Unicode range - and it
    does not rely on Uniscribe.

    Also, if this conversion feature was on by default your font would be
    useless for anything but displaying Tibetan text - in which case why not
    use Tibetan characters in the first place? If it was a feature that
    could be turned on and off by the user you would need some kind of
    interface in applications that allowed you to turn the feature on or off
    for selected ranges of text.

    Apple's AAT/ATSUI font format has a specific Transliteration feature to
    transliterate between different script pairs like this - though
    Latin to Tibetan is not yet supported in the spec. If there is a real
    need to do something like this with OpenType I think the best way of
    proceeding would be to try and get a similar feature included in the
    OpenType spec rather than attempting to do it by overloading existing
    features.

    Personally I feel it is far better to store Tibetan text using Tibetan
    characters rather than in Roman transliteration. A feature that worked
    the other way round (allowing you to display Tibetan script as Latin
    transliteration) might be useful for those who can't read Tibetan
    script. However, since Wylie transliteration includes all the
    unpronounced Tibetan prefixes and so on, it isn't much more
    comprehensible than Tibetan script itself is to those who don't already
    know how to read the script.

    [BTW I was the one who first showed Gregor how to use VOLT ;-)]

    regards

    - Chris

    Philipp Reichmuth wrote:

    > cfynn@gmx.net wrote on 06.01.05 05:25:00:
    >
    >>Gregor Verhufen's old Tibetan fonts did *not* use an "underlying
    >>Wylie" representation for Tibetan data or glyphs. They use a
    >>"font-hack" encoding where Tibetan glyphs are mapped to Windows ANSI
    >>code-page characters - but there is no real relationship between the
    >>characters they are mapped to and Wylie transliteration.
    >
    >
    > I'm not referring to the older fonts he distributed; it's true what you
    > say about them. I was referring to some experiments here at the institute
    > shortly after VOLT became available. (I know Gregor quite well; actually,
    > it was him who got me started on OpenType at all ;) It wasn't really a big
    > thing, and I don't know if he ever distributed any of it, so it isn't really
    > much to talk about.
    >
    > Philipp
