Re: Looking for transcription or transliteration standards latin->arabic

From: John Cowan (jcowan@reutershealth.com)
Date: Fri Jul 09 2004 - 22:33:01 CDT

    Peter Kirk scripsit:

    > I have just reviewed this list and found it odd that Hebrew presentation
    > forms are included but Arabic ones are not.

    The specification actually called only for Latin, Greek, and Cyrillic;
    I added Hebrew pour la lagniappe. If someone wants to add Arabic, I
    encourage them to do so.

    > the Hebrew presentation forms but also most of the precomposed
    > characters are redundant in this list.

    True; however, the current list indicates the scope of what actually
    happens, even if it is overlong.

    > It is therefore
    > necessary to list in the specification of the folding only all (?)
    > combining marks, which are to be deleted,

    I believe this folding deletes all Mn-class characters, and only those.
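
    A minimal sketch of that rule, assuming the folding amounts to canonical
    decomposition (NFD) followed by deletion of every character whose General
    Category is Mn. The function name strip_marks is hypothetical and is not
    taken from the specification; the real folding may do more than this.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose canonically (NFD), then drop every Mn-class character.

    Assumption: this is only the mark-deletion step described above,
    not the complete folding from the specification.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )


if __name__ == "__main__":
    # Print code points before and after so the effect is visible.
    for sample in ("\u00e9", "\u0419", "\u0429"):
        folded = strip_marks(sample)
        before = " ".join(f"U+{ord(c):04X}" for c in sample)
        after = " ".join(f"U+{ord(c):04X}" for c in folded)
        print(f"{before} -> {after}")
```

    Under this sketch, precomposed characters need no entries of their own:
    anything with a canonical decomposition into a base plus Mn marks is
    reduced to the base automatically.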

    > I note that 0429 is not folded to 0428 etc, and this is correct because
    > within the Cyrillic writing system these are entirely separate
    > characters. But the difference between these two is in fact exactly the
    > same descender which is removed in 0496 etc.

    I don't think that matters. Long historical practice has made SHCHA a
    separate letter, just as G, J, U, and W are now separate Latin letters
    from C, I, V, and VV-ligature.

    > I am also surprised to note
    > that no folding is given for 0419/0439; although in some ways this is
    > desirable because Russians do not consider this breve to be a diacritic
    > (and after all we would not want the dot on i to be removed as a
    > diacritic!), these characters have canonical decompositions to 0418/0438
    > and breve and the principle of canonical equivalence and the folding
    > algorithm (which works on decomposed characters) more or less demand
    > that the breve be deleted. Also 048A/048B should then fold to 0418/0438
    > rather than 0419/0439.

    I think I agree with this: i-breve does not have the same universal status as
    shch.
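
    The distinction can be checked against the Unicode Character Database.
    The sketch below uses Python's unicodedata module to report which of the
    code points discussed here carry a canonical decomposition; the helper
    name canonical_decomposition is hypothetical, introduced only for
    illustration.

```python
import unicodedata


def canonical_decomposition(ch: str) -> str:
    """Return the canonical decomposition mapping of ch, or '' if none.

    Compatibility mappings start with a tag such as '<compat>' and are
    deliberately ignored here, since only canonical equivalence matters
    for this argument.
    """
    mapping = unicodedata.decomposition(ch)
    if mapping.startswith("<"):
        return ""
    return mapping


if __name__ == "__main__":
    # Code points mentioned in this thread.
    for cp in (0x0419, 0x0439, 0x0429, 0x0496, 0x048A):
        ch = chr(cp)
        mapping = canonical_decomposition(ch) or "(none)"
        print(f"U+{cp:04X} {unicodedata.name(ch)}: {mapping}")
```

    0419/0439 decompose to 0418/0438 plus 0306 (combining breve), so a
    folding that works on decomposed text removes the breve. 0429, 0496,
    and 048A have no decomposition, so any folding for them has to be
    listed explicitly.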

    -- 
    John Cowan  www.reutershealth.com  www.ccil.org/~cowan  jcowan@reutershealth.com
    'Tis the Linux rebellion / Let coders take their place,
    The Linux-nationale / Shall Microsoft outpace,
    We can write better programs / Our CPUs won't stall,
    So raise the penguin banner of / The Linux-nationale.
    

