Re: Accented ij ligatures (was: Unicode Public Review Issues update)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Jul 01 2003 - 08:22:56 EDT


    On Tuesday, July 01, 2003 1:55 PM, Kent Karlsson <kentk@cs.chalmers.se> wrote:

    > > My feeling about the proposed "Public Review" document should
    > > exclude the <ij> ligature, waiting for the decision about the new
    > > <dotless-ij> ligature approved in the first rounds by UTC and
    > > waiting for approval by ISO JTC...
    >
    > There is no proposal to add any dotless ij ligature character. Please
    > read the pipeline documents more carefully before going off imagining
    > a character not being proposed, and is unlikely to be seriously
    > proposed.

    Sorry, I should have written <dotless-j> in the last paragraph, for
    the proposed character at U+0237 (LATIN SMALL LETTER DOTLESS J).

    For me the <ij> ligature is mostly used for Dutch, and the few
    applications where <ij,acute> and <ij,macron> are used should
    render them according to that language, where <ij> is handled as a
    single letter.

    In all other cases, the <ij> ligature should be avoided, simply because
    there are better choices: <i>/<dotless-i>/<I>/<dotted-I> and
    <j>/<J>/<proposed-dotless-j>, combined with double diacritics
    inserted between them to produce the desired effect.
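
    As a rough illustration of the two encodings discussed above, here is
    a minimal Python sketch. It assumes the standard unicodedata module,
    and it assumes that "double diacritics" means characters such as
    U+035E COMBINING DOUBLE MACRON placed between <i> and <j>; the
    variable names and the describe() helper are only illustrative.

```python
import unicodedata

IJ = "\u0133"             # LATIN SMALL LIGATURE IJ
ACUTE = "\u0301"          # COMBINING ACUTE ACCENT
MACRON = "\u0304"         # COMBINING MACRON
DOUBLE_MACRON = "\u035e"  # COMBINING DOUBLE MACRON (assumed example of a double diacritic)

# Dutch-style encoding: one letter followed by one combining mark.
ij_acute = IJ + ACUTE
ij_macron = IJ + MACRON

# Alternative encoding: two separate letters with a double diacritic between them.
i_j_double_macron = "i" + DOUBLE_MACRON + "j"


def describe(text):
    """Return 'U+XXXX NAME' for every code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, text in [
    ("ij + acute", ij_acute),
    ("ij + macron", ij_macron),
    ("i + double macron + j", i_j_double_macron),
]:
    print(label, describe(text))

# The ligature only has a compatibility decomposition, so NFC keeps it
# intact while NFKC replaces it with the two letters <i> and <j>.
print(describe(unicodedata.normalize("NFKC", IJ)))
```

    The point of the sketch is only that both forms are plain sequences
    of existing code points; which one a font renders well is a separate
    matter.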

    In either case, the "Soft_Dotted" property is probably overkill on
    the existing <ij> or <IJ> ligatures (which would have been better
    named "letters" rather than "ligatures") for Dutch. Or is this update
    needed to document officially the expected rendering behavior for
    the sequences <ij,acute> and <ij,macron>?
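
    To make the question concrete, here is a small Python sketch of what
    a "Soft_Dotted" lookup could mean for rendering. It is not the
    normative rule: the set below is a hand-written subset containing
    only <i> and <j>, the loses_dot() helper is hypothetical, and whether
    <ij> belongs in the set is exactly the open question.

```python
import unicodedata

# Illustrative subset only; the real property list is longer.
SOFT_DOTTED_SUBSET = frozenset({"i", "j"})
IJ = "\u0133"  # LATIN SMALL LIGATURE IJ


def loses_dot(base, marks, soft_dotted):
    """Simplified check: a soft-dotted base drops its dot when any of the
    following combining marks is positioned above (combining class 230)."""
    if base not in soft_dotted:
        return False
    return any(unicodedata.combining(mark) == 230 for mark in marks)


ACUTE = "\u0301"      # combining class 230 (above)
MACRON = "\u0304"     # combining class 230 (above)
DOT_BELOW = "\u0323"  # combining class 220 (below)

# Without <ij> in the set, an accent leaves the dots of the ligature alone.
print(loses_dot(IJ, ACUTE, SOFT_DOTTED_SUBSET))

# With <ij> added (the change under discussion), the dots would be dropped.
print(loses_dot(IJ, MACRON, SOFT_DOTTED_SUBSET | {IJ}))
```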

    The main interest of the Soft_Dotted property is not to describe the
    rendering of the character, but to document how case conversions
    (lowercase, uppercase, titlecase, folded) can be performed safely on
    the Unicode encoded string. I'd like to know exactly why it is needed
    for Dutch, as such a ligature is not used in Turkish and Azeri written
    with the Altaic Latin alphabet...
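
    For reference, the case-conversion problem I have in mind can be
    sketched in Python as follows. The built-in str.upper() and
    str.lower() are locale-independent, so turkic_upper() and
    turkic_lower() below are hypothetical helpers written only for this
    illustration, not part of any library.

```python
IJ_SMALL = "\u0133"          # LATIN SMALL LIGATURE IJ
IJ_CAPITAL = "\u0132"        # LATIN CAPITAL LIGATURE IJ
DOTLESS_I = "\u0131"         # LATIN SMALL LETTER DOTLESS I
CAPITAL_DOTTED_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def turkic_upper(text):
    """Hypothetical Turkish/Azeri uppercasing: i -> dotted capital I,
    dotless i -> I, everything else by the default mapping."""
    return text.replace("i", CAPITAL_DOTTED_I).replace(DOTLESS_I, "I").upper()


def turkic_lower(text):
    """Hypothetical Turkish/Azeri lowercasing: dotted capital I -> i,
    I -> dotless i, everything else by the default mapping."""
    return text.replace(CAPITAL_DOTTED_I, "i").replace("I", DOTLESS_I).lower()


word = "diyarbak" + DOTLESS_I + "r"

# Default casing merges dotted and dotless i into the same capital.
print(f"{word.upper()!a}")

# The language-dependent mapping keeps them apart and round-trips.
print(f"{turkic_upper(word)!a}")
print(f"{turkic_lower(turkic_upper(word))!a}")

# The <ij> letter is a single code point, so both mappings agree on it.
print(IJ_SMALL.upper() == turkic_upper(IJ_SMALL) == IJ_CAPITAL)
```

    In this sketch the language-dependent step only touches the four
    i/I characters; the <ij> and <IJ> code points pass through the
    default mapping unchanged.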

    If fonts still want to display dots on these characters, that's a
    rendering problem: there already exist many fonts, used for
    languages other than Turkish and Azeri, which do not display any
    dot on a lowercase ASCII i or j (normally dotted), and which display
    a dot on their uppercase ASCII versions (normally not dotted in
    classic fonts)...

    The absence or presence of these dots is then seen as "decorative".
    Such fonts are not suitable for Turkish and Azeri, but this is
    clearly not an encoding problem in the Unicode encoded text,
    and not a problem for case conversions either.

    The only reason that would justify adding a "Soft_Dotted" property
    on <ij> would be that it is needed to allow the correct handling
    of language-dependent case conversions.

    -- Philippe.


