Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Jul 10 2003 - 11:21:36 EDT


    On Thursday, July 10, 2003 12:08 PM, Peter Kirk <peter.r.kirk@ntlworld.com> wrote:

    > On 1st July Philippe Verdy wrote:
    >
    > > If fonts still want to display dots on these characters, that's a
    > > rendering problem: there already exists a lot of fonts used for
    > > languages other than Turkish and Azeri, which do not display any
    > > dot on a lowercase ASCII i or j (dotted), and display a dot on their
    > > uppercase ASCII versions (normally not dotted with classic fonts)...
    > >
    > > The absence or presence of these dots is then seen as "decorative"
    > > even if these fonts are not suitable for Turkish and Azeri, but
    > > this is clearly not an encoding problem in the Unicode encoded text,
    > > and not a problem either for case conversions.
    > >
    >
    > Turkish and Azeri do not use the ij ligature. The sequences i - j and
    > dotless i - j do occur (rarely, as j is a rare letter in both
    > languages) but are treated as separate letters.

    I know. The quoted paragraph was not about the ij ligature but about
    the separate dotted and dotless i/I letters. Some "decorated" fonts
    draw the lowercase ASCII (dotted) i code point with a dotless glyph,
    or the uppercase ASCII (dotless) I code point with a dotted glyph
    (in some fonts the dot is merged into decorative curves). Such fonts
    are indeed not suitable for Turkish and Azeri.
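
    To make the point concrete, here is a minimal Python sketch of the
    four letters as code points. The distinction is carried by the
    encoding, not by the glyphs a font happens to draw. The helper
    turkic_upper is hypothetical (it is not a standard library function)
    and only illustrates the Turkish/Azeri case pairs i/İ and ı/I:

```python
import unicodedata

# The four distinct letters used by Turkish and Azeri.
LETTERS = ["\u0069", "\u0131", "\u0049", "\u0130"]


def describe(ch: str) -> str:
    """Return the code point and Unicode character name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Hypothetical helper, for illustration only: Turkic uppercase pairs.
_TURKIC_UPPER = {"\u0069": "\u0130", "\u0131": "\u0049"}


def turkic_upper(text: str) -> str:
    """Uppercase text using the Turkish/Azeri pairing of i and dotless i."""
    mapped = "".join(_TURKIC_UPPER.get(ch, ch) for ch in text)
    return mapped.upper()


if __name__ == "__main__":
    for letter in LETTERS:
        print(describe(letter))
    print(turkic_upper("kırmızı fil"))
```

    Whatever glyph a decorative font assigns to U+0069 or U+0049, the
    encoded text and its case mapping stay unambiguous.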

    > In Turkish and Azeri the sequences f - i and f - dotless i both occur,
    > and are fairly frequent. So it is inappropriate in these languages to
    > use fi ligatures in which the dot on the i is lost or invisible, at
    > least where the second character is a dotted i. Has any thought been
    > given to this issue? Is it possible to block such ligation on a
    > language-dependent basis?

    Isn't there a "Grapheme Disjoiner" format control character to force the
    absence of a ligature like <fi>, i.e. <f, GDJ, i>?
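
    For what it is worth, the character the Unicode Standard documents
    for requesting that two letters not be ligated is U+200C ZERO WIDTH
    NON-JOINER, so the sequence would be <f, ZWNJ, i>. A minimal sketch
    follows; the function name block_fi_ligature is an assumption made
    for illustration, and whether a given font and renderer honor the
    request is outside what this code can show:

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def block_fi_ligature(text: str) -> str:
    """Insert ZWNJ between every adjacent f and dotted i.

    Hypothetical helper: it only edits the encoded text. The dotless i
    (U+0131) is left alone, since only the dotted i loses its dot in a
    typical fi ligature.
    """
    return text.replace("fi", "f" + ZWNJ + "i")


if __name__ == "__main__":
    marked = block_fi_ligature("fil")
    print([f"U+{ord(c):04X}" for c in marked])
    print(unicodedata.name(ZWNJ))
```

    Applying the function twice changes nothing further, because the
    marked sequence no longer contains an adjacent f and i.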

    > Also it is certainly possible that in dictionaries etc in these
    > languages stress might be marked by an accent on the vowel - as
    > certainly in the older Cyrillic Azeri just as in Bulgarian as just
    > posted. In this case the dot should not be removed from the dotted i
    > when the stress mark is added, so that the distinction from dotless i
    > is not lost. Has that issue been addressed? (In my Latin script Azeri
    > dictionary stress is marked by a spacing grave accent before the
    > vowel, but this may have been done precisely to work around this
    > problem.)

    This is part of the proposal under review: an explicit combining
    dot-above diacritic can be inserted between the normal (soft-dotted)
    base letter and the following above diacritic (combining class 230):
    <latin-small-i, dot-above, acute-accent>
    <cyrillic-small-je, dot-above, grave-accent>
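
    As a small check of how such sequences behave, here is a Python
    sketch using the standard unicodedata module. The helper names are
    assumptions made for illustration. It relies on the fact that
    U+0307 COMBINING DOT ABOVE and the acute and grave accents all have
    combining class 230, so normalization does not reorder them, and
    the explicit dot prevents the i from composing with the accent:

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT
GRAVE = "\u0300"      # COMBINING GRAVE ACCENT


def stressed_keep_dot(base: str, accent: str) -> str:
    """Build <base, dot-above, accent> so the dot is explicitly encoded."""
    return base + DOT_ABOVE + accent


def code_points(text: str) -> list:
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in text]


if __name__ == "__main__":
    latin = stressed_keep_dot("i", ACUTE)
    cyrillic = stressed_keep_dot("\u0458", GRAVE)  # CYRILLIC SMALL LETTER JE
    for seq in (latin, cyrillic):
        # The sequence is unchanged by NFC: the marks share class 230.
        print(code_points(unicodedata.normalize("NFC", seq)))
    # Without the explicit dot, NFC composes i + acute into U+00ED.
    print(code_points(unicodedata.normalize("NFC", "i" + ACUTE)))
```

    This only shows that the encoded sequence is stable; whether a
    renderer actually keeps the dot visible under the accent is a
    separate font and rendering question.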

    -- Philippe.


