RE: Accented ij ligatures (was: Unicode Public Review Issues update)

From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Wed Jul 02 2003 - 06:20:54 EDT


    > Believe it or not, the IJ and ij digraphs *were* included for
    > compatibility with an 8-bit legacy character set (ISO 6937).

    ISO 6937 is a multibyte encoding (one or two bytes per character).
    It contains no combining characters at all, although it is commonly
    misunderstood to contain them, because the lead bytes are (almost)
    systematically assigned.
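
    To illustrate the point, here is a minimal Python sketch of a decoder
    for a tiny excerpt of such an encoding. The function name, the table,
    and its three entries are my own illustration, not taken from the
    standard's text; the byte values are as I recall them (0xC1 grave,
    0xC2 acute, 0xC8 diaeresis) and should be checked against ISO 6937
    itself. Treating 0x20-0x7E as plain ASCII is also a simplification.
    What it shows is that each two-byte sequence is looked up as a whole
    and yields one character, and that a lead byte followed by an
    arbitrary letter is simply undefined.

```python
# Hypothetical excerpt of a two-byte code table: (lead byte, second byte)
# maps to ONE character. Byte values are assumptions to verify.
TWO_BYTE_CODES = {
    (0xC1, 0x65): "\u00e8",  # lead byte for grave + 'e' -> e with grave
    (0xC2, 0x65): "\u00e9",  # lead byte for acute + 'e' -> e with acute
    (0xC8, 0x75): "\u00fc",  # lead byte for diaeresis + 'u' -> u with diaeresis
}
LEAD_BYTES = {lead for (lead, _) in TWO_BYTE_CODES}


def decode_6937_excerpt(data: bytes) -> str:
    """Decode bytes using the small illustrative table above."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b in LEAD_BYTES:
            if i + 1 >= len(data):
                raise ValueError("truncated two-byte sequence")
            pair = (b, data[i + 1])
            if pair not in TWO_BYTE_CODES:
                # Not a combining mechanism: only listed pairs exist.
                raise ValueError(f"undefined two-byte code {pair!r}")
            out.append(TWO_BYTE_CODES[pair])
            i += 2
        elif 0x20 <= b <= 0x7E:
            out.append(chr(b))  # simplification: treat as ASCII
            i += 1
        else:
            raise ValueError(f"byte 0x{b:02X} not covered by this sketch")
    return "".join(out)


print(decode_6937_excerpt(b"caf\xc2e"))
```

    In this sketch, the acute lead byte followed by "q" raises an error
    instead of producing an accented q, which is the behaviour one would
    not expect from a genuine combining character.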

    > Whether
    > that automatically means they should have been assigned canonical
    > instead of compatibility decompositions, I don't know.

    I think in this case it is correct that the decomposition is a compatibility
    one. It could also have had no decomposition at all, as is the case for
    the oe and ae ligatures.
    This is in contrast to the MICRO SIGN, which ideally should have had
    a canonical decomposition; but Latin-1 characters got special treatment
    (and ASCII characters got even more special treatment in this regard:
    some ASCII spacing accents are not decomposed at all).
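
    For anyone who wants to check these cases, here is a small sketch
    using Python's standard unicodedata module. The helper name
    decomposition_kind and the choice of sample characters are mine.
    It relies on the convention that a compatibility decomposition is
    reported with a leading tag such as <compat>, a canonical one has no
    tag, and an empty string means there is no decomposition.

```python
import unicodedata


def decomposition_kind(ch: str) -> str:
    """Classify the decomposition mapping of a single character."""
    field = unicodedata.decomposition(ch)
    if not field:
        return "none"
    # Compatibility mappings start with a tag like "<compat>".
    return "compatibility" if field.startswith("<") else "canonical"


# Sample characters chosen to mirror the cases discussed above.
SAMPLES = [
    "\u0133",  # LATIN SMALL LIGATURE IJ
    "\u0153",  # LATIN SMALL LIGATURE OE
    "\u00e6",  # LATIN SMALL LETTER AE
    "\u00b5",  # MICRO SIGN (Latin-1)
    "\u00a8",  # DIAERESIS (Latin-1 spacing accent)
    "\u005e",  # CIRCUMFLEX ACCENT (ASCII spacing accent)
    "\u00e9",  # LATIN SMALL LETTER E WITH ACUTE, for contrast
]

for ch in SAMPLES:
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
        f"{decomposition_kind(ch)} {unicodedata.decomposition(ch)!r}"
    )
```

    I would expect this to report "compatibility" for the ij ligature,
    the MICRO SIGN and the Latin-1 DIAERESIS, "none" for oe, ae and the
    ASCII circumflex, and "canonical" for e with acute.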

                    /kent k


