Re: Missing African Latin letters

From: Don Osborn (dzo@bisharat.net)
Date: Sat Dec 06 2003 - 02:46:44 EST

    Philippe, This is an interesting inventory and prompts a couple of
    questions. First, do you have a corresponding list of languages (and
    countries) for these various characters? That would be useful. Second, is
    this part of a project?

    On the characters themselves, a few quick comments on four that you mention:
    * Re alpha, in what languages do you see these uses? In Latin transcription
    of Tamasheq, I believe the alpha may have been suggested as an alternative
    to the a with turned breve/caron. In other words, the alpha and one of
    these diacritical characters would be equivalent (not really an answer to
    the lack of a capital alpha when you need it).
    * Again the r-bar issue - is this to be a precomposed pair of characters
    now?
    * Re u-bar, this is apparently used in the DRC by up to three languages
    (Lengu, Mangbetu, Budu) and in Cameroon reportedly by seven (West. Ejagham,
    Fe'efe'e, Kozime, Limbum, Mekaa, Yamba, and Yemba). Having a lowercase but
    no capital form leaves the options of composing the capital (not ideal) or
    adding another precomposed character.
    * Re hooked w, this was discussed on a12n-collaboration as being used in one
    language of Burkina Faso (Puguli). See, among others,
    http://lists.kabissa.org/lists/archives/public/a12n-collaboration/msg00474.html
    (which also mentions the u-bar).
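
    For anyone who wants to check where these letters stand in their own
    environment, here is a minimal Python sketch (not part of the inventory
    under discussion). It reports whether a character has an uppercase
    mapping in the Unicode data bundled with the interpreter. The names
    case_report and THREAD_LETTERS are made up for illustration, and the
    answers depend on the Unicode version the runtime carries, so no
    particular output is claimed here.

```python
import unicodedata


def case_report(ch):
    """Describe one character's case pairing in this runtime's Unicode data."""
    upper = ch.upper()
    has_mapping = upper != ch
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        # True only when uppercasing yields something different from ch.
        "has_upper": has_mapping,
        "upper": " ".join(f"U+{ord(c):04X}" for c in upper) if has_mapping else None,
    }


# Small letters named in this thread that already have code points
# (alpha, h with hook, n with retroflex hook, r with fishhook, u bar).
THREAD_LETTERS = [0x0251, 0x0266, 0x0273, 0x027E, 0x0289]

if __name__ == "__main__":
    # The result varies with the Unicode version, so print it for reference.
    print("Unicode data version:", unicodedata.unidata_version)
    for cp in THREAD_LETTERS:
        print(case_report(chr(cp)))
```

    A letter showing has_upper as False in one environment may well show
    True in another that carries newer Unicode data, so the version line is
    worth keeping next to any result you quote.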

    Don Osborn
    Bisharat.net

    ----- Original Message -----
    From: "Philippe Verdy" <verdy_p@wanadoo.fr>
    To: "Michael Everson" <everson@evertype.com>
    Cc: "Unicode@Unicode.Org" <unicode@unicode.org>
    Sent: Saturday, December 06, 2003 6:12 AM
    Subject: RE: Missing African Latin letters

    > Michael Everson wrote:
    > > Philippe Verdy wrote:
    > > > Some letters used in Latin transcription of Pan-Sahelian scripts
    > > > are still missing in Unicode:
    > > > Is there a proposal to include them, as they are needed for case
    > > > folding and capital transcription?
    > > > I can immediately identify these two that would be needed on
    > > > Pan-Sahelian keyboards:
    > > >
    > > >U+027E LATIN SMALL LETTER R WITH FISHHOOK
    > > >U+???? LATIN CAPITAL LETTER R WITH FISHHOOK
    > > >
    > > >U+0266 LATIN SMALL LETTER H WITH HOOK
    > > >U+???? LATIN CAPITAL LETTER H WITH HOOK
    > >
    > > Looking at the International Niamey keyboard layout given at
    > > http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=IntlNiameyKybd
    > > it can be observed that, of the set of letters used, four of them do
    > > not have capital forms:
    > >
    > > U+0266 LATIN SMALL LETTER H WITH HOOK
    > > U+027E LATIN SMALL LETTER R WITH FISHHOOK
    >
    > I gave these two in my message.
    >
    > > U+0251 LATIN SMALL LETTER ALPHA
    >
    > (Note that the big Alpha often has a glyph made distinct from an uppercase
    > A, using a rounded top instead of the angular shape, but some writers seem
    > to add a turned breve (or a breve or another diacritic) above a standard A
    > to make it appear as Alpha and not A. Sometimes the "alpha" is marked with
    > an AE or ae letter (I think it comes from limited character sets, where AE
    > was present but there was no possible distinction for the Latin letter
    > Alpha). I don't know the language in which it is written, so I can't say
    > whether these are a fallback convention for printing or a separate
    > orthographic letter. Maybe an Africanist will reply to this...)
    >
    > > U+0273 LATIN SMALL LETTER N WITH RETROFLEX HOOK
    >
    > I did not list them, but there are other missing capital letters
    > (apart from glottal stops, which do not seem to have case
    > variation, although it may be possible that this is rendered
    > in some African texts for titles):
    >
    > 01AA;LATIN LETTER REVERSED ESH LOOP;Ll;0;L;;;;;N;;<??>;;<??>;
    > <??>;LATIN CAPITAL REVERSED ESH LOOP;Lu;0;L;;;;;N;;;01AA;;
    >
    > 01BA;LATIN SMALL LETTER ESH WITH TAIL;Ll;0;L;;;;;N;;<??>;;<??>;
    > <??>;LATIN CAPITAL LETTER ESH WITH TAIL;Lu;0;L;;;;;N;;;01BA;;
    >
    > 0271;LATIN SMALL LETTER M WITH HOOK;Ll;0;L;;;;;N;LATIN SMALL LETTER M HOOK;<??>;;<??>;
    > <??>;LATIN CAPITAL LETTER M WITH HOOK;Lu;0;L;;;;;N;;;0271;;
    >
    > 0289;LATIN SMALL LETTER U BAR;Ll;0;L;;;;;N;;<??>;;<??>;
    > <??>;LATIN CAPITAL LETTER U BAR;Lu;0;L;;;;;N;;;0289;;
    >
    > 028C;LATIN SMALL LETTER TURNED V;Ll;0;L;;;;;N;;<??>;;<??>;
    > <??>;LATIN CAPITAL LETTER TURNED V;Lu;0;L;;;;;N;;;028C;;
    >
    > Conversely, there also seems to exist an x-height bilabial click (i.e. a
    > lowercase version, the encoded one probably being uppercase or uncased)...
    >
    > 0298;LATIN LETTER BILABIAL CLICK;<???Lu???>;0;L;;;;;N;LATIN LETTER BULLSEYE;;<??>;;
    > <??>;LATIN SMALL LETTER BILABIAL CLICK;Ll;0;L;;;;;N;;0298;;0298;
    >
    > And of course there are the three missing letter pairs:
    >
    > U+???? LATIN SMALL LETTER R BAR
    > U+???? LATIN CAPITAL LETTER R BAR
    >
    > U+???? LATIN SMALL LETTER W WITH HOOK
    > U+???? LATIN CAPITAL LETTER W WITH HOOK
    >
    > U+???? LATIN SMALL LETTER KHI (or X WITH HOOK?)
    > U+???? LATIN CAPITAL LETTER KHI (or X WITH HOOK?)
    >
    > And I did not check all the letter variants composed with lower dots rather
    > than hooks and bars, but this second set seems complete.
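
    The draft entries quoted above follow the semicolon-separated layout of
    UnicodeData.txt. As a rough sketch, assuming the 15-field layout of that
    file, the following Python splits one record into named fields so the
    case-mapping columns are easier to read. The names FIELDS and
    parse_record are illustrative only and do not come from any Unicode tool.

```python
# Field order of a UnicodeData.txt record (15 semicolon-separated fields).
FIELDS = (
    "code",
    "name",
    "general_category",
    "combining_class",
    "bidi_class",
    "decomposition",
    "decimal",
    "digit",
    "numeric",
    "mirrored",
    "unicode_1_name",
    "iso_comment",
    "uppercase",
    "lowercase",
    "titlecase",
)


def parse_record(line):
    """Split one UnicodeData-style line into a dict keyed by field name."""
    parts = line.strip().split(";")
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(parts)}: {line!r}"
        )
    return dict(zip(FIELDS, parts))


if __name__ == "__main__":
    # The published record for LATIN SMALL LETTER A, used as a known-good input.
    record = parse_record("0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041")
    for key in ("code", "name", "uppercase", "lowercase", "titlecase"):
        print(f"{key:10} {record[key]!r}")
```

    Feeding it the draft lines quoted above is a quick way to see which
    field each <??> placeholder actually occupies, and the length check
    flags any record that lost or gained a field when the mail was wrapped.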



    This archive was generated by hypermail 2.1.5 : Sat Dec 06 2003 - 03:30:03 EST