Re: Dutch IJ, again

From: Mark Davis (mark.davis@jtcsv.com)
Date: Mon May 26 2003 - 11:13:34 EDT

    > Because there ARE words in Dutch where the combination i+j is not
    > the same as ij (e.g. "bijectie"), and I wouldn't know how to
    > formulate those situations in SpecialCasing.txt.

    But this shouldn't be an issue. There are three possible cases:

    bijectie
    BIJECTIE
    Bijectie

    You would never have "Ij" in a word like that anyway, right? (except
    something truly bizarre like "bIjEcTiE"...) So the only problem for
    Dutch comes in the titlecasing (aka initial-caps) of "ij". So the
    question then becomes:

    * Are there any Dutch words of the form "ijx" (i.e. *starting* with
    "ij") that are titlecased as "Ijx" (where x is any string of 0 or more
    characters)?

    Only for those words would you need to distinguish between normal
    "IJ"/"ij" and

    U+0132 (Ĳ) LATIN CAPITAL LIGATURE IJ
    or
    U+0133 (ĳ) LATIN SMALL LIGATURE IJ

    Otherwise the rule would be that in titlecasing, you also uppercase a
    "j" that immediately follows the word-initial "I".
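
    A minimal sketch of that rule in Python, assuming the answer to the
    question above is "no" (no Dutch word starting with "ij" is titlecased
    as "Ijx"). The name dutch_titlecase is a hypothetical helper, not
    something defined by SpecialCasing.txt or any library, and it handles
    one word at a time only.

```python
def dutch_titlecase(word: str) -> str:
    """Titlecase a single word, treating a word-initial "ij" as a unit."""
    if not word:
        return word
    lowered = word.lower()
    if lowered.startswith("ij"):
        # Assumed rule: the "j" right after the initial "I" is uppercased too.
        return "IJ" + lowered[2:]
    # Everything else gets ordinary initial-caps treatment. This includes
    # words like "bijectie", where "ij" is not word-initial, and words that
    # already use the ligature U+0133, whose uppercase mapping is U+0132.
    return lowered[0].upper() + lowered[1:]


if __name__ == "__main__":
    for sample in ("ijsselmeer", "bijectie", "BIJECTIE", "\u0133s"):
        print(sample, "->", dutch_titlecase(sample))
```

    Under this sketch only the first two characters of a word are ever
    inspected, which is why "bijectie" needs no special handling.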

    Mark
    __________________________________
    http://www.macchiato.com
    ► “Eppur si muove” ◄

    ----- Original Message -----
    From: "Pim Blokland" <pblokland@planet.nl>
    To: "Mark Davis" <mark.davis@jtcsv.com>
    Sent: Monday, May 26, 2003 07:12
    Subject: Re: Dutch IJ, again

    > Mark Davis wrote:
    >
    > > > Why didn't I find a special casing rule for the *pair* of
    > > > characters "ij" with Dutch (nl) in the UCD ?
    > >
    > > You didn't find it because although various people have
    > > muttered about it in the past, nobody has yet made a
    > > formal proposal to the UTC, listing all the specific changes
    > > that would be needed for the text and data files.
    >
    > Oh... I assumed that what had happened was that the solution for the
    > casing problem was to include ĳ and Ĳ in Unicode as singular
    > codepoints.
    > Because there ARE words in Dutch where the combination i+j is not
    > the same as ij (e.g. "bijectie"), and I wouldn't know how to
    > formulate those situations in SpecialCasing.txt.
    >
    > In the current situation, we DO have correctly cased codepoints (ĳ
    > and Ĳ) if we need them, and we have the i+j where we don't need the
    > "ij" sound, and if we type ij where we should have typed ĳ, it's our
    > own fault, so other than muttering about awkward input methods for
    > U+0133 and fonts that display U+0133 as a square, I don't think
    > anything real will change soon.
    >
    > Pim Blokland
    >
    >


