Re: Missing capital H from Unicode range (see 1E96)

From: Philippe Verdy
Date: Tue Aug 16 2005 - 10:59:36 CDT

    From: "Andreas Prilop"
    > On Mon, 15 Aug 2005, I wrote:
    >> The ISO and DIN transliteration of Arabic U+062E is U+1E2B
    >> "h with breve below", which has an upper-case form U+1E2A.
    > Correction:
    > DIN 31635 (1982) and ISO/R 233 (1961) use "h with breve below",
    > which is the "orientalists' system".
    > ISO 233 (1984) and ISO 233-2 (1993) use "h with line below".
    > Sorry for the confusion.
    > However, in academic usage, I find either the orientalists' system
    > "h with breve below" or the Anglo-American system "kh" (with or
    > without underlining *both* letters).
    > This might be an explanation why there is no precomposed
    > "capital H with line below".

    So, in effect, Unicode/ISO 10646 lacks a precomposed character for the
    capital H with line (or macron?) below used in ISO 233 and ISO 233-2. It
    seems to have been overlooked when ISO 10646 was created...
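
    The gap can be checked with a short Python sketch. It assumes that "line
    below" is encoded as U+0331 (COMBINING MACRON BELOW) and "breve below" as
    U+032E (COMBINING BREVE BELOW); the helper name compose() is only
    illustrative and does not come from any standard.

```python
import unicodedata

MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW, assumed here for "line below"
BREVE_BELOW = "\u032e"   # COMBINING BREVE BELOW


def compose(base: str, mark: str) -> str:
    """Return the NFC form of a base letter followed by one combining mark."""
    return unicodedata.normalize("NFC", base + mark)


small_line = compose("h", MACRON_BELOW)    # a precomposed form exists (U+1E96)
capital_line = compose("H", MACRON_BELOW)  # no precomposed form to compose to
capital_breve = compose("H", BREVE_BELOW)  # a precomposed form exists (U+1E2A)

for label, text in [
    ("h + line below", small_line),
    ("H + line below", capital_line),
    ("H + breve below", capital_breve),
]:
    # List the code points that remain after normalization.
    print(label, [f"U+{ord(ch):04X}" for ch in text])
```

    Under NFC the small letter and the capital with breve below each collapse
    to one code point, while the capital with line below stays a two-code-point
    sequence, so it can only be represented with a combining mark.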

    What is the status of ISO 233 and ISO 233-2? Given the publication dates
    they may be more accurate than DIN 31635 and ISO/R 233 used by
    "orientalists". So who is supposed to use the two other standards?

    And for round-trip compatibility, there is no solution for now, unless
    Unicode either encodes a compatibility character (unlikely, I suppose,
    given the current policy) or at least defines a "named sequence", to
    suggest that fonts should be designed to include this single letter.

    Given the Anglo-American usage (which may be coded with the double diacritic
    U+035F placed between k and h), I think there may even exist variants
    with a joining breve below [or above] both letters of kh (the double
    diacritic U+035C [or U+0361] encoded between k and h), notably in texts
    where underlining could be confused with other usages, and where leaving
    the cluster unmarked could also be confusing. As these uses are academic,
    any of those notations may be introduced when precision is needed. After
    all, the Latin alphabet is foreign to the Arabic script, and using Latin is
    just a way to discuss the Arabic system in European languages; it is a
    notation for easier understanding, as the IPA notation system is, not
    something for general linguistic use.
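
    As a rough illustration of the encoding order described above, the sketch
    below builds the kh cluster with each of the three double diacritics placed
    between the two letters. The function name tie() and the dictionary labels
    are hypothetical; the character names are taken from Python's unicodedata
    module.

```python
import unicodedata

# Double diacritics mentioned above; each one spans the letter before it
# and the letter after it.
DOUBLE_MARKS = {
    "underline": "\u035f",
    "joined below": "\u035c",
    "joined above": "\u0361",
}


def tie(first: str, second: str, mark: str) -> str:
    """Encode a two-letter cluster with a double diacritic between the letters."""
    return first + mark + second


clusters = {label: tie("k", "h", mark) for label, mark in DOUBLE_MARKS.items()}

for label, text in clusters.items():
    mark = text[1]
    # Show the official character name and the code points of the sequence.
    print(label, unicodedata.name(mark), [f"U+{ord(ch):04X}" for ch in text])
```

    The printed names show that U+0361 is formally a double inverted breve; a
    plain double breve above is U+035D. None of these sequences has a
    precomposed form, so normalization leaves them unchanged and rendering
    depends on font support for the double marks.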
