Re: ISO-15924 script nodes and UAX#24 script IDs

From: Antoine Leca
Date: Tue May 18 2004 - 08:32:10 CDT


    Philippe Verdy wrote on Tuesday, May 18th, 2004 12:24:
    > Also there are differences in spelling in the table lists:
    > the plain text version and Table 2 use consonants with dot
    > below for the English name, but Table 1 uses basic Latin
    > consonants (example for Malayalam).

    I believe these are typos, which you ought to list exhaustively for Michael
    so that he can have them corrected.

    It looks to me as though all the diacritics should have been dropped in
    English, and that a number of them slipped through the net...

    > Dots below are probably appropriate for the French name,
    > not for the English one.


    French usage has always been to "morph" the original name to suit French
    orthographic rules.
    OTOH, it appears to me (feel free to contradict me, and also to point me
    to the period when these things changed) that the English habit is now to
    follow the native name and the transliteration rules. A good example I found
    recently is the name of Cervantes' main work, whose short name is "Don
    Quixote" in English, the same as it was in (original) Castilian, while at
    the same time it was adapted in French as "Don Quichotte" (same
    pronunciation as the original), and similarly in today's Castilian as "Don
    Quijote" (with a subsequent change in pronunciation). I do not know how
    native English speakers pronounce it, however.

    Another point is that the reference works on scripts in French are for a
    good part old-fashioned, while at the same time recent English references
    seem to abound. I may be biased here (I surely am, in fact), but it appears
    to me to reflect a certain evolution in the worldwide use of languages in
    scientific works over the last century...

    As a result, when we built the tables for 15924, we chose to have the
    French names reflect widely used practice, with the obvious
    conventions (like ^ for the lengthened vowels) and some long-used ones, like
    for s in Indian scripts (but ch for ? since it fits the need well). But we
    do avoid all the "strange" characters. The case of ḷ in Malayaḷam (or O?iya)
    is exemplary: the sound does not exist in French, and almost no French
    speaker will know how to say it correctly. Furthermore, I highly doubt that
    the first reaction of a literate French reader who sees subscripted dots
    would be to imagine the retroflex feature this convention implies... So we
    followed current practice and dropped all the subscripted dots.
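
    For anyone who wants to check a list of names for stray subscripted dots,
    here is a minimal sketch in Python of the "drop the dots below" step. It
    is only an illustration, not the way the 15924 tables were actually
    produced; the function name drop_dots_below is made up for this example,
    and it assumes the dots are encoded as (or decompose to) U+0323 COMBINING
    DOT BELOW. Other marks, such as macrons, are left alone.

```python
import unicodedata

# U+0323 COMBINING DOT BELOW, the mark used for retroflex consonants
# in common transliteration schemes.
DOT_BELOW = "\u0323"


def drop_dots_below(name: str) -> str:
    """Return `name` with every combining dot below removed.

    Hypothetical helper for illustration only.
    """
    # NFD splits precomposed letters (e.g. U+1E37) into base letter + marks.
    decomposed = unicodedata.normalize("NFD", name)
    # Remove only the dot below; any other combining marks are kept.
    stripped = decomposed.replace(DOT_BELOW, "")
    # Recompose whatever marks remain (e.g. a + macron back to U+0101).
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    # "Malaya\u1e37am" is the spelling with l + dot below.
    print(drop_dots_below("Malaya\u1e37am"))
```

    A name that comes out different from its input still carried at least one
    subscripted dot, which is one way to find the typos mentioned above.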

    We do keep a number of strange spellings for the alternate variants (in
    parentheses), particularly when usage was not fixed (I particularly
    remember this in the case of Cham).

