Re: Dutch IJ character

Date: Mon Apr 28 2003 - 17:48:25 EDT

    Thomas Milo wrote on 04/28/2003 04:11:05 PM:

    > I am not fully convinced IJ should be treated as digraph. The glitch is
    > it capitalizes as a whole

    As has been mentioned by others, that can be handled with a sequence
    < i, j > in the algorithms apps use for case mapping, provided the app
    knows that the text being processed is Dutch.
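
To illustrate the kind of language-aware case mapping meant here, a minimal Python sketch follows. The function name `titlecase_word` and the `lang` parameter are made up for this example and are not any particular library's API; it only covers capitalizing the start of a word.

```python
def titlecase_word(word: str, lang: str = "") -> str:
    """Capitalize the first text element of a word.

    Hypothetical helper: for Dutch ("nl"), a leading < i, j > sequence
    is capitalized as a unit; otherwise only the first character is.
    """
    if not word:
        return word
    if lang == "nl" and word[:2].lower() == "ij":
        # Dutch tailoring: both letters of the digraph are capitalized.
        return "IJ" + word[2:]
    # Default behaviour; U+0133 maps to U+0132 through str.upper().
    return word[0].upper() + word[1:]


print(titlecase_word("ijsselmeer", "nl"))  # Dutch tailoring applied
print(titlecase_word("ijsselmeer"))        # no language information
```

Without the language tag the sketch falls back to the default mapping, which is exactly the "provided the app knows the text is Dutch" condition.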

    > , and that older users try to emulate it with Y.
    > And, it cannot be broken apart so that ICE CREAM on a corner shop is
    > IJ
    > S ...

    For a sequence < i, j >, that is a matter of using a language-specific
    tailoring for detecting text-element boundaries; again, apps can handle
    this if the developers know of the need and they simply choose to do so
    (and, again, assuming that the apps know that the text is Dutch).
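
A rough Python sketch of such a tailoring, assuming a hypothetical function `text_elements` and a simple `lang` tag: it splits a string into units for vertical setting and keeps < i, j > together for Dutch. It ignores combining marks and the rest of real text-element segmentation, and a naive match like this will also catch the occasional i + j that is not the digraph.

```python
def text_elements(text: str, lang: str = "") -> list[str]:
    """Split text into units that must not be broken apart.

    Hypothetical helper: one element per character, except that for
    Dutch ("nl") a < i, j > sequence in any case stays together.
    """
    elements = []
    pos = 0
    while pos < len(text):
        if lang == "nl" and text[pos:pos + 2].lower() == "ij":
            elements.append(text[pos:pos + 2])  # keep the digraph whole
            pos += 2
        else:
            elements.append(text[pos])
            pos += 1
    return elements


# One element per line, as on a vertically set shop sign.
print("\n".join(text_elements("IJS", "nl")))
```

The single character U+0133 or U+0132 needs no special handling here, since it is already one element.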

    > And, the telephone directories put IJ and Y in the same sorting position.

    That is very simply handled; indeed, surely a lot of software already does
    this for sequences < i, j >, and all that's needed is to make sure those
    implementations do the same for U+0133.
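
As a toy illustration only (real collation would use a proper tailoring rather than string replacement), here is a Python sort key that puts < i, j >, U+0133 and Y in the same position. The name `phonebook_key` is hypothetical.

```python
def phonebook_key(name: str) -> str:
    """Sort key that files < i, j > and U+0133 under Y.

    Hypothetical helper: a crude stand-in for a collation tailoring.
    """
    # casefold() lowercases; U+0132 folds to U+0133.
    folded = name.casefold().replace("\u0133", "ij")
    # Treat the digraph as equivalent to "y" for ordering purposes.
    return folded.replace("ij", "y")


names = ["Zeeman", "IJsbrand", "Xander", "Yvonne"]
print(sorted(names, key=phonebook_key))
```

Because the single character is first rewritten to the sequence, both encodings get the same key.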

    In all of these things, Dutch "ij" is not significantly different from
    digraphs from any number of other languages, such as "ch" for Slovak etc,
    and such as "gb" and "mb" and "nd" for Banda and Gbaya and dozens (if not
    hundreds) of other African languages.

    > Absence of support for these features are a daily nuisance and lead to a
    > visible deterioration in printed materials. Which technology (Unicode,
    > OpenType, AAT/ATSUI, Graphite) is used to bring these features back is
    > irrelevant to the Dutch user.

    True, but insisting that the right solution is to start creating input
    methods that generate U+0133 won't work on its own, since there is a lot of
    existing data that uses < i, j >, and there will continue to be
    implementations that generate new data using < i, j >. The only solution
    has to be a comprehensive one that solves the problems using either a
    sequence < i, j > or the single character U+0133, treating the two equally
    well and, for most practical purposes, as effectively the same thing.
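
One way to picture "treating the two equally" is to fold the single characters to the sequence before comparing or searching. The Python sketch below uses hypothetical helpers `fold_ij` and `same_text`. Unicode's compatibility normalization (NFKC) performs the same mapping for U+0132 and U+0133, but it also folds many unrelated compatibility characters, so the sketch uses a targeted replacement instead.

```python
import unicodedata


def fold_ij(text: str) -> str:
    """Rewrite U+0132 / U+0133 as the sequences < I, J > / < i, j >."""
    return text.replace("\u0132", "IJ").replace("\u0133", "ij")


def same_text(a: str, b: str) -> bool:
    """Compare two strings, ignoring which IJ encoding each one uses."""
    return fold_ij(a) == fold_ij(b)


print(same_text("ijs", "\u0133s"))
# NFKC applies the same mapping to the single character.
print(unicodedata.normalize("NFKC", "\u0133") == "ij")
```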

    - Peter

    Peter Constable

    Non-Roman Script Initiative, SIL International
    7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
    Tel: +1 972 708 7485
