Re: Dutch IJ character

From: Peter_Constable@sil.org
Date: Mon Apr 28 2003 - 17:48:25 EDT

Next message: Mark Davis: "Unicode 4.0 in ICU demos"

Previous message: John Hudson: "Re: [OT] multilingual support in MS products (was Re: Kurdish ghayn)"
Next in thread: Doug Ewell: "Re: Dutch IJ character"
Maybe reply: Doug Ewell: "Re: Dutch IJ character"
Maybe reply: Markus Scherer: "Re: Dutch IJ character"

Thomas Milo wrote on 04/28/2003 04:11:05 PM:

> I am not fully convinced IJ should be treated as digraph. The glitch is
> that it capitalizes as a whole

As has been mentioned by others, that can be handled with a sequence
< i, j > in the algorithms apps use for case mapping, provided the app
knows that the text being processed is Dutch.
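
A minimal sketch of such a language-aware case mapping follows. The
function name dutch_capitalize and the lang parameter are hypothetical,
chosen only for illustration; a real implementation would normally rely
on a library's locale-sensitive case mapping rather than a hand-written
rule.

```python
def dutch_capitalize(word: str, lang: str = "nl") -> str:
    """Capitalize the first text element of a word.

    Assumption: when lang is "nl", a word-initial < i, j > is treated
    as one unit and capitalized as a whole. Any other language tag
    falls back to capitalizing only the first character.
    """
    if not word:
        return word
    if lang == "nl" and word[:2].lower() == "ij":
        return "IJ" + word[2:]
    # U+0133 already uppercases to U+0132 under the default mapping.
    return word[0].upper() + word[1:]
```

Without the language tag the function cannot know that "ij" is a unit,
which is the condition stated above: the app has to know the text is
Dutch.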

> , and that older users try to emulate it with Y.
> And, it cannot be broken apart so that ICE CREAM on a corner shop is
>
> IJ
> S ...

For a sequence < i, j >, that is a matter of using a language-specific
tailoring for detecting text-element boundaries; again, apps can handle
this if the developers know of the need and they simply choose to do so
(and, again, assuming that the apps know that the text is Dutch).
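
As a rough illustration of such a tailoring, the sketch below splits a
string into text elements and keeps < i, j > together when the text is
tagged as Dutch. The name dutch_text_elements is hypothetical, and the
rule is deliberately naive: it ignores combining marks, mixed-case "Ij",
and words in which an adjacent i and j do not form the digraph.

```python
def dutch_text_elements(text: str, lang: str = "nl") -> list[str]:
    """Split text into elements, keeping "ij"/"IJ" unbroken for Dutch.

    Assumption: every adjacent same-case i + j in Dutch text is the
    digraph. This is a simplification for illustration only.
    """
    elements = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if lang == "nl" and pair in ("ij", "IJ"):
            elements.append(pair)
            i += 2
        else:
            elements.append(text[i])
            i += 1
    return elements


# Vertical setting of a shop sign: one element per line.
sign = "\n".join(dutch_text_elements("IJS"))
```

With the Dutch tailoring the sign is set as "IJ" over "S"; without it
the same string breaks into three lines.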

> And, the telephone directories put IJ and Y in the same sorting position.

That is very simply handled; indeed, surely a lot of software already does
this for sequences < i, j >, and all that's needed is to make sure those
implementations do the same for U+0133.
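
A sketch of a sort key with that behavior, assuming a simple
telephone-directory convention in which both < i, j > and U+0133 sort
exactly where "y" does. The name phonebook_key is hypothetical, and a
production system would express this as a collation tailoring rather
than as string replacement.

```python
def phonebook_key(name: str) -> str:
    """Build a sort key that files IJ under Y.

    Assumption: case is ignored, and every "ij" sequence is the Dutch
    digraph. casefold() maps U+0132 to U+0133, so one replacement
    covers both cases of the single character.
    """
    folded = name.casefold()
    return folded.replace("\u0133", "y").replace("ij", "y")


names = ["Zwart", "IJsbrand", "Yperen", "Xander"]
ordered = sorted(names, key=phonebook_key)
# ordered == ["Xander", "Yperen", "IJsbrand", "Zwart"]
```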

In all of these things, Dutch "ij" is not significantly different from
digraphs from any number of other languages, such as "ch" for Slovak etc,
and such as "gb" and "mb" and "nd" for Banda and Gbaya and dozens (if not
hundreds) of other African languages.

> Absence of support for these features are a daily nuisance and lead to a
> visible deterioration in printed materials. Which technology (Unicode,
> OpenType, AAT/ATSUI, Graphite) is used to bring these features back is
> irrelevant to the Dutch user.

True, but insisting that the right solution is to start creating input
methods that generate U+0133 won't work on its own, since there is a lot of
existing data that uses < i, j >, and there will continue to be
implementations that generate new data using < i, j >. The only solution
has to be a comprehensive one that solves the problems using either a
sequence < i, j > or the single character U+0133, treating the two equally
well and, for most practical purposes, as effectively the same thing.
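
One way to treat the two encodings as the same thing is to fold the
single characters to their sequences before comparing or searching. The
sketch below does only that fold; the names IJ_FOLD, fold_ij and
equivalent are hypothetical. Unicode compatibility normalization (NFKC
or NFKD) performs the same mapping for U+0132 and U+0133, but it also
folds many unrelated characters, so an explicit table is used here.

```python
# Map the single characters to the equivalent two-letter sequences.
IJ_FOLD = str.maketrans({"\u0132": "IJ", "\u0133": "ij"})


def fold_ij(text: str) -> str:
    """Replace U+0132 / U+0133 with < I, J > / < i, j >."""
    return text.translate(IJ_FOLD)


def equivalent(a: str, b: str) -> bool:
    """Compare two strings, ignoring how IJ happens to be encoded."""
    return fold_ij(a) == fold_ij(b)
```

Folding toward the sequence rather than toward U+0133 matches the
situation described above, in which most existing data already uses
< i, j >.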

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485


This archive was generated by hypermail 2.1.5 : Mon Apr 28 2003 - 18:33:30 EDT