Re: Compiling a list of Semitic transliteration characters from Philippe Verdy on 2012-09-07 (Unicode Mail List Archive)

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Fri, 7 Sep 2012 15:58:48 +0200

2012/9/7 Leif Halvard Silli <xn--mlform-iua_at_xn--mlform-iua.no>:
> The word "Roman", can also refer to "Greek". So it is best to avoid
> that term. ;-)

Many languages were spoken in the Roman Empire (and written in
various scripts), from Europe to Asia and Africa, even if Latin was
used in Rome and written in the Latin script (though not exclusively).

But the conventional meaning of "romanisation" is a transcription to
the Latin script (independently of the target language).

The concept of transliteration, rather than transcription, is in fact
quite new in human history: the initial need was just to write how
languages were pronounced, with more or less approximation, to match
the way another language is written, read and pronounced (in the
target phonology). So a transcription has always been lossy.

But the real difference between transcription and transliteration
lies in their role: a transliteration attempts to preserve as much as
possible of the source language's phonology and meaning, avoiding
most ambiguities. So a transliteration occurs within the same
language. A transliteration scheme is created when a language starts
changing its standard script in some area. But even in that case it
is extremely rare for this conversion to be lossless: the orthography
is frequently adapted, and some historic spellings of the original
script are not carried over. Examples are mute letters or, more
frequently, letters whose pronunciation has changed so considerably
that the original orthography in the source script is already far
from the language as actually spoken. Some historic distinctions are
also no longer heard, so the transliteration scheme represents the
letters the same way: N-to-1 mappings are then frequent as well.

For some pairs of scripts, it is impossible to be 1-to-1, because the
scripts work very differently: alphabets are not like abjads or
aksharas, and not like ideographic scripts. So adaptation is
unavoidable. When a language changes its standard script, there is
also very frequently an orthographic reform in the new script, so even
the transliteration rules contain a lot of new exceptions, to match
the new orthography. When this change of script is just motivated by
easing the learning of the language for people who are more familiar
with another script, the transliteration rules will often be
stricter. They will be much stricter if this change of script is
motivated by technical reasons (but people are generally not very well
trained in how to make this conversion, so each of them will use
their own transliteration scheme to approximate the language).

For this reason, the distinction between lossy and lossless is not
very relevant for distinguishing a traditional transcription from a
"modern" transliteration. My opinion is that the simplest conversions
that try to avoid most ambiguities are just named "transliteration",
and they occur within the same language in the same region.
Transcriptions are more traditional and, instead of focusing on the
source language, they try to best approximate its pronunciation in
the phonology of another language, using that language's current
common orthography.

Different needs, different rules, but in both cases the rules are not
followed exactly. Neither of them is lossless. But the distinction is
there. There's no clearly defined separation line between
transliteration schemes and transcription schemes, except by their
intent: to preserve a source language, or to best approximate another
one.

So the standard conversions of Chinese from Han ideographs to Bopomofo
or Latin (with the Pinyin standard) could be called "transliterations"
even if they are evidently quite lossy. The same goes for Romaji in
Japanese. And even for Korean, the standard conversion from the Hangul
alphabet to Latin creates some ambiguities and is a bit lossy.
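
To make the N-to-1 loss concrete, here is a minimal Python sketch. The
four characters and their Pinyin readings are only an illustrative
sample picked for this example, and the helper names (strip_tones,
invert) are made up here; this is not any standard's conversion table.
It inverts the mapping to show how many source characters share one
romanised form, with and without tone marks.

```python
import unicodedata
from collections import defaultdict

# Illustrative sample only: four common characters and their Pinyin readings.
PINYIN = {
    "妈": "mā",
    "麻": "má",
    "马": "mǎ",
    "骂": "mà",
}

def strip_tones(syllable: str) -> str:
    """Drop all combining marks from a syllable.

    This removes the tone diacritics (and would also remove the
    diaeresis of "ü", which is acceptable for this sketch).
    """
    decomposed = unicodedata.normalize("NFD", syllable)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def invert(mapping: dict) -> dict:
    """Group source characters by the romanised form they map to."""
    inverse = defaultdict(list)
    for source, target in mapping.items():
        inverse[target].append(source)
    return dict(inverse)

with_tones = invert(PINYIN)
without_tones = invert({han: strip_tones(p) for han, p in PINYIN.items()})

# With tone marks, each sample character keeps a distinct romanisation;
# without them, all four collapse onto one form and cannot be recovered.
print(len(with_tones), "distinct forms with tones")
print(len(without_tones), "distinct form(s) without tones:", without_tones)
```

In this small sample the tone marks are enough to keep the four
characters apart, but in the real language many characters share both
syllable and tone, so the conversion stays N-to-1 even with tones.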

Note also that a transcription can also occur within the same script:
when you adapt an orthography to use other letters than in the
original orthography, this is not a transliteration. For example, when
you transcribe French for an English context, you'll commonly convert
"ou" into "oo", or disambiguate some "s" into "z", or some "c" into
"k" or "s". The intent is to tell native English speakers how to read
a word written in another language (e.g. you say that the French word
"paille" should be read like the English "pie"). This is not a
transliteration but a transcription.

Likewise, when you convert the language into a phonetic alphabet like
IPA, the process is definitely not a transliteration but a
transcription, even if this occurs within the same Latin script (many
people argue that IPA is not part of the Latin script, as it does not
contain "letters" but "symbols", it is monocameral, and it cannot
follow the common typographic rules, in addition to the fact that it
borrows symbols from Greek letters and adds new specific symbols plus
many new diacritics)!
Received on Fri Sep 07 2012 - 09:02:48 CDT
