RE: Unicode Transliteration Guidelines released

From: William J Poser (
Date: Sun Jan 27 2008 - 19:46:45 CST


    I agree that it is very odd for Unicode to be promulgating
    transliterations, since an appropriate transliteration is not
    only specific to a pair of languages but also depends on the
    purpose for which it is intended.

    There are, however, uses for ASCII transliterations even with the
    advent of Unicode. I have had to create and implement several such
    for the Linguistic Data Consortium. One reason for using them
    is that sometimes people want to use existing software that cannot
    handle Unicode, so you need to asciify the text, run it through,
    and then convert it back. For this purpose, the transliteration can
    be pretty arbitrary so long as it is reversible. Indeed, some people
    here have used a slightly modified form of the Unicode character names
    as the ASCII transliteration. It is long-winded, but the computers
    do the work and they don't seem to mind.
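
    A reversible scheme based on character names can be sketched
    roughly as follows. This is only a minimal illustration in Python,
    not the format actually used at the LDC: the function names
    asciify and deasciify, the brace delimiters, and the U+XXXX
    fallback for characters without a name are all assumptions made
    for the example. It relies on the standard unicodedata module.

```python
import re
import unicodedata


def asciify(text: str) -> str:
    """Replace every non-ASCII character with {UNICODE CHARACTER NAME}.

    A literal "{" is escaped the same way, so that an opening brace in
    the output always starts a token. (Hypothetical format, chosen only
    to keep the mapping reversible.)
    """
    parts = []
    for ch in text:
        if ch.isascii() and ch != "{":
            parts.append(ch)  # plain ASCII passes through unchanged
            continue
        name = unicodedata.name(ch, None)
        # Characters without a name (e.g. private use) fall back to a
        # code point token such as U+E000.
        token = name if name is not None else f"U+{ord(ch):04X}"
        parts.append("{" + token + "}")
    return "".join(parts)


_TOKEN = re.compile(r"\{([^{}]+)\}")


def deasciify(text: str) -> str:
    """Invert asciify(): turn each {TOKEN} back into its character."""

    def restore(match):
        token = match.group(1)
        if token.startswith("U+"):
            return chr(int(token[2:], 16))
        return unicodedata.lookup(token)

    return _TOKEN.sub(restore, text)
```

    With this sketch, asciify("caf\u00e9") gives
    "caf{LATIN SMALL LETTER E WITH ACUTE}", and deasciify() applied to
    that output restores the original string. The output is verbose,
    but any ASCII-only tool that leaves the braces and names alone can
    sit in the middle of the pipeline.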

    Another reason for using an ASCII transliteration is when you've
    got computational linguists working on a language that they don't
    know well, whose writing system they cannot easily work with.
    In this case, you want the transliteration to be less arbitrary
    and to give some idea of the pronunciation so that they can talk
    to themselves and each other about the data (suppose, for example,
    they've got to write a morphological analyzer).

