From: Philippe Verdy (email@example.com)
Date: Sat Nov 27 2004 - 15:12:55 CST
From: "Addison Phillips [wM]" <firstname.lastname@example.org>
> For example, Dutch sometimes treats the sequence "ij" as a single letter
> (it turns out that there are characters for the letter 'ij' in Unicode
> too, but they are for compatibility with an ancient non-Unicode character
> set). Software must be modified or tailored to provide behavior consistent
> with the specific language and context.
Not sure about that: not all Dutch "ij" letter pairs are a single grapheme,
so there are cases where the two letters must be treated as distinct and not
as a single letter. For this reason, Dutch will need a distinct "ij" letter,
coded as a single character, and with its own capitalization rules (the
uppercase or titlecase form of "ij" will be the single letter "IJ", not two
letters and not "Ij"; there also exist cases where diacritics can be added
on top of the "ij" letter, which is then more tied as a single letter than a
This distinction is also often made visible in the typography: the
single-letter "ij" digraph is shown with the leg of the "j" kerned deeply
below (and sometimes to the left of) the leading "i", unlike cases where
they are treated as two letters, where no kerning occurs (the 'i' is shown
completely to the left of the bottom-left leg of 'j'). It is even more
evident in the uppercase style: there is the standard small distance
between the I and J glyphs when they are two distinct letters, but in the
single letter the uppercase I may be drawn in the middle of the left leg
of the J.
Note the very close resemblance of the "ij" single letter to a y with a
diaeresis (so you'll also find Dutch texts that use y with diaeresis instead
of the correct "ij" letter, notably in texts coded with legacy charsets).
This distinction is also preserved for uppercase, where the missing "IJ"
single letter appears encoded with Y with diaeresis...
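To make the code points concrete, here is a small Python sketch (an illustration only, using the standard unicodedata module; the helper names describe and fits_latin1 are made up for this example). It looks at the single-character "ij" and "IJ" that Unicode encodes at U+0133 and U+0132, and at y with diaeresis in ISO 8859-1, taken here as one example of a legacy charset:

```python
import unicodedata

IJ_SMALL = "\u0133"    # single-character "ij"
IJ_CAPITAL = "\u0132"  # single-character "IJ"
Y_DIAERESIS = "\u00ff"          # small y with diaeresis
Y_DIAERESIS_CAPITAL = "\u0178"  # capital Y with diaeresis


def describe(ch):
    """Return the code point and Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def fits_latin1(ch):
    """True if the character can be encoded in ISO 8859-1 (Latin-1)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


for ch in (IJ_SMALL, IJ_CAPITAL, Y_DIAERESIS, Y_DIAERESIS_CAPITAL):
    print(describe(ch), "| in Latin-1:", fits_latin1(ch))

# Default case mapping keeps the single letter as a single letter.
print(IJ_SMALL.upper() == IJ_CAPITAL)

# Compatibility normalization folds the single letter into "i" + "j",
# which loses the distinction discussed in this thread.
print(unicodedata.normalize("NFKC", IJ_SMALL) == "ij")
```

In this particular charset only the small y with diaeresis is available; neither form of the single "ij" letter is, and neither is the capital Y with diaeresis.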
These cases in Dutch where there's a distinction between the single letter
digraph and two letters are rare, so it is often acceptable to encode the
digraph with two letters, without creating linguistic ambiguities (in most
cases...), or with y with diaeresis/umlaut (which otherwise is not a letter
used in Dutch).
For me, your allusion to legacy charsets is about the deprecated use of y
with diaeresis, not about the use of a distinct "IJ" letter, which is needed
for Dutch and should be treated as distinct from the "I then J" letters.
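The capitalization rule mentioned earlier ("IJ", not "Ij") can be sketched in Python as follows. This is only a rough illustration: dutch_capitalize is a hypothetical helper, and it assumes that a word-initial "ij" typed as two letters is always the single-letter digraph, which is exactly the kind of assumption that breaks down in the rare cases where the two letters are distinct.

```python
def dutch_capitalize(word):
    """Capitalize a word, treating a leading "ij" as one letter.

    Assumption (not guaranteed by the text): a word-initial "i" + "j"
    pair is always the Dutch single-letter digraph.
    """
    if word[:2].lower() == "ij":
        # Both parts of the digraph are uppercased together.
        return "IJ" + word[2:].lower()
    # Otherwise use ordinary capitalization; this also covers the
    # single-character form U+0133, whose uppercase is U+0132.
    return word[:1].upper() + word[1:].lower()


# Untailored capitalization uppercases only the "i".
print("ijsselmeer".capitalize())
# The tailored helper uppercases the whole digraph.
print(dutch_capitalize("ijsselmeer"))
# When the letter is coded as a single character, no tailoring is needed.
print(dutch_capitalize("\u0133sselmeer") == "\u0132sselmeer")
```

With a distinct single character the generic case mapping already gives the right result, whereas the two-letter encoding needs language-specific tailoring.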
This archive was generated by hypermail 2.1.5 : Sat Nov 27 2004 - 15:14:53 CST