From: Peter Kirk (email@example.com)
Date: Tue Jul 29 2003 - 09:11:25 EDT
On 28/07/2003 19:05, Kenneth Whistler wrote:
>This is, of course, precisely the desired result -- the CGJ is
>ignored for weighting, but its presence prevents the reordering
>of the vowels into the undesired sequence by normalization.
>And the resultant weighted key weights the vowels in the correct
>order.
>
>Tailoring of the collation table could modify any of this, but
>the above example is what you get just using the default table.
>But it is important that people implementing searching and sorting
>for Hebrew understand why and how the CGJ is "ignored" in this
>context, in order to get correct results. For example, if you
>strip the CGJ and *then* hand the string to the collation weighting
>algorithm, normalization will again rearrange the points into
>the wrong order for weighting.
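
[Editorial illustration, not part of the original message.] The reordering effect described above can be sketched with Python's standard unicodedata module. The sequence used here (lamed, patah, hiriq) is an assumed example chosen for illustration, not necessarily the one discussed earlier in the thread, and the sketch shows only the normalization step, not collation weighting itself.

```python
import unicodedata

# Assumed illustrative sequence: LAMED + PATAH + HIRIQ.
LAMED = "\u05DC"   # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"   # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, combining class 14
CGJ = "\u034F"     # COMBINING GRAPHEME JOINER, combining class 0


def codepoints(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


plain = LAMED + PATAH + HIRIQ
with_cgj = LAMED + PATAH + CGJ + HIRIQ

# Without CGJ, canonical ordering sorts the points by combining class,
# so HIRIQ (14) moves in front of PATAH (17).
normalized_plain = unicodedata.normalize("NFD", plain)

# With CGJ (class 0) between the points, they are no longer adjacent
# non-starters, so normalization leaves their order alone.
normalized_cgj = unicodedata.normalize("NFD", with_cgj)

# Stripping CGJ *before* normalization reintroduces the reordering.
stripped_then_normalized = unicodedata.normalize("NFD", with_cgj.replace(CGJ, ""))

# Normalizing first and dropping CGJ afterwards keeps the intended order.
normalized_then_stripped = normalized_cgj.replace(CGJ, "")

print("plain, normalized:       ", codepoints(normalized_plain))
print("with CGJ, normalized:    ", codepoints(normalized_cgj))
print("strip, then normalize:   ", codepoints(stripped_then_normalized))
print("normalize, then strip:   ", codepoints(normalized_then_stripped))
```

The last two lines of output show the point of the warning: where in the pipeline the CGJ is dropped determines whether the points reach the weighting step in the intended order.
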
Thank you, Ken. In this particular case we might want to tailor the
collation table so that this CGJ is effectively ignored. But I don't
understand this aspect of Unicode well enough to know exactly what can
-- Peter Kirk firstname.lastname@example.org http://web.onetel.net.uk/~peterkirk/