From: Kent Karlsson (firstname.lastname@example.org)
Date: Wed Jul 02 2003 - 06:20:54 EDT
> Believe it or not, the IJ and ij digraphs *were* included for
> compatibility with an 8-bit legacy character set (ISO 6937).
ISO 6937 is a multibyte encoding (one or two bytes per character).
It contains no combining characters at all, although it is commonly
misunderstood to contain them, because the lead bytes are (almost)
systematically assigned.
> that automatically means they should have been assigned canonical
> instead of compatibility decompositions, I don't know.
I think in this case it is correct that the decomposition is a compatibility
one. It could also have been none at all, as for the oe and ae ligatures.
This is in contrast to the MICRO SIGN, which ideally should have had
a canonical decomposition; but Latin-1 characters got special treatment
(and ASCII characters get even more special treatment in this regard:
some ASCII spacing accents are not decomposed at all).
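
For anyone who wants to check these cases, here is a minimal sketch using
Python's standard unicodedata module. The helper name decomposition_kind is
my own, not anything from the standard, and the module reports whatever
Unicode version ships with the interpreter, so treat it as an illustration
rather than a reading of the data files as they stood when this was written.

```python
import unicodedata

def decomposition_kind(ch):
    """Classify a character's decomposition mapping.

    Hypothetical helper: returns 'none', 'canonical', or 'compatibility'.
    A mapping that starts with a <tag> (e.g. <compat>) is a compatibility one.
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    return "compatibility" if mapping.startswith("<") else "canonical"

samples = [
    "\u0132",  # LATIN CAPITAL LIGATURE IJ
    "\u0133",  # LATIN SMALL LIGATURE IJ
    "\u0153",  # LATIN SMALL LIGATURE OE
    "\u00e6",  # LATIN SMALL LETTER AE
    "\u00b5",  # MICRO SIGN (Latin-1)
    "\u00b4",  # ACUTE ACCENT (Latin-1 spacing accent)
    "\u0060",  # GRAVE ACCENT (ASCII spacing accent)
]

for ch in samples:
    mapping = unicodedata.decomposition(ch) or "(no mapping)"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{decomposition_kind(ch)} {mapping}")
```

Because the IJ mapping is a compatibility one, NFD leaves U+0132 alone and
only NFKD (or NFKC) turns it into the two letters I and J.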