From: John Cowan (email@example.com)
Date: Tue Jul 08 2003 - 11:14:45 EDT
Peter Kirk scripsit:
> The solution for this sequence is as follows: Define a new combining
> character something like HEBREW LIGATURE PATAH HIRIQ with a canonical
> decomposition of hiriq - patah (yes, that way round) and a glyph with a
> hiriq to the left of a patah. How does this help? Well, it will not
> affect users who type patah then hiriq, in non-canonical order, into an
> application which does not immediately normalise the text, as the
> renderer will still render hiriq to left of patah as typed. But when
> this text is normalised into NFC, the sequence will first be reordered
> as hiriq - patah, and then this combination will be composed into the
> new ligature. That is correct, isn't it?
Such a character could only be encoded if it were put into the list
of composition exclusions, because otherwise it would upset the stability
of normalization. The guarantee is that as long as a text contains only
characters that occur in version V of Unicode, all normalizers written to
versions greater than or equal to V will produce the same results on it.
You are creating a situation where patah followed by hiriq will normalize
one way in Unicode 4.0 (since those are 4.0 characters) and another way
in some later version. So what you want is as big a no-no as changing
a canonical decomposition, and for exactly the same reason.
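
To make the point concrete, here is a minimal sketch using Python's standard
unicodedata module. It assumes whatever Unicode version your Python build
ships with, and the variable names are mine, purely for illustration. It shows
that patah followed by hiriq is already reordered to hiriq - patah by
normalization, and that an existing excluded precomposed character
(U+FB1D HEBREW LETTER YOD WITH HIRIQ) is never produced by NFC, which is
the only way a new composite could be added without changing results on
existing text.

```python
import unicodedata

ALEF, YOD = "\u05D0", "\u05D9"
HIRIQ, PATAH = "\u05B4", "\u05B7"
YOD_WITH_HIRIQ = "\uFB1D"  # precomposed, listed in the composition exclusions


def nfc(text: str) -> str:
    """Normalize text to NFC with the Unicode data bundled in this Python."""
    return unicodedata.normalize("NFC", text)


# Canonical combining classes decide the reordering: the lower class sorts first.
hiriq_class = unicodedata.combining(HIRIQ)
patah_class = unicodedata.combining(PATAH)

# Text typed as patah then hiriq is reordered to hiriq then patah.
typed = ALEF + PATAH + HIRIQ
reordered = nfc(typed)

# An excluded composite decomposes under NFC and is never recomposed,
# so old text keeps normalizing the same way after the composite was added.
excluded_decomposition = unicodedata.decomposition(YOD_WITH_HIRIQ)
from_sequence = nfc(YOD + HIRIQ)
from_composite = nfc(YOD_WITH_HIRIQ)

print([f"U+{ord(c):04X}" for c in reordered])
print([f"U+{ord(c):04X}" for c in from_composite])
```

A hypothetical HEBREW LIGATURE PATAH HIRIQ that was not excluded would make
the first result differ between normalizers written to different versions,
which is exactly the instability described above.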
-- John Cowan  http://www.ccil.org/~cowan  firstname.lastname@example.org
Be yourself. Especially do not feign a working knowledge of RDF where no
such knowledge exists. Neither be cynical about RELAX NG; for in the face
of all aridity and disenchantment in the world of markup, James Clark is
as perennial as the grass.  --DeXiderata, Sean McGrath