From: John Hudson (john@tiro.ca)
Date: Thu May 18 2006 - 12:17:59 CDT
theiling@absint.com wrote:
> While programming a compatibility decomposition plus case folding (two
> things in one step), I noticed that
> 
>     U+0345  COMBINING GREEK YPOGEGRAMMENI
> is converted to
>     U+03B9  GREEK SMALL LETTER IOTA
> 
> but that a code position like
> 
>     U+0363  COMBINING LATIN SMALL LETTER A
> is not converted to
>     U+0061  LATIN SMALL LETTER A
> 
> The same holds for some similar combining chars.
The former is a peculiarity of the Greek writing system, not a general rule for combining 
letter-like marks. The ypogegrammeni is written as a full iota when it follows an 
uppercase letter.
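
The difference can be seen in a short Python sketch. It assumes the standard library's unicodedata module and str.casefold() reflect the Unicode data under discussion. The helper name fold_and_decompose is hypothetical; it simply chains compatibility decomposition and case folding as described in the question, and is not anyone's actual implementation.

```python
import unicodedata

def fold_and_decompose(text: str) -> str:
    """Hypothetical helper: compatibility decomposition plus case folding."""
    decomposed = unicodedata.normalize("NFKD", text)
    folded = decomposed.casefold()
    # Normalize again, since case folding can produce new sequences.
    return unicodedata.normalize("NFKD", folded)

YPOGEGRAMMENI = "\u0345"   # COMBINING GREEK YPOGEGRAMMENI
COMBINING_A = "\u0363"     # COMBINING LATIN SMALL LETTER A

for ch in (YPOGEGRAMMENI, COMBINING_A):
    result = fold_and_decompose(ch)
    codes = " ".join(f"U+{ord(c):04X}" for c in result)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {codes}")

# Neither mark has a decomposition mapping; the iota comes from
# case folding alone.
print(repr(unicodedata.decomposition(YPOGEGRAMMENI)))
print(repr(unicodedata.decomposition(COMBINING_A)))
```

Both characters are nonspacing marks (general category Mn), so the different results come from the Greek-specific case mapping rather than from any general treatment of letter-like marks.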
> Is there a reason for it?  This would then result in some letter-like
> chars not being found when searching for them as a letter.
But they are not letters, they are combining marks that happen to be based on letters: 
their function is not alphabetical. In general, you don't want them to be confusable with 
or decomposed to letter characters. The Greek ypogegrammeni is an exception, and I believe 
there are case roundtripping issues as a result.
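
One such roundtripping issue can be sketched in Python, assuming its built-in str.upper() and str.lower() follow the Unicode case mappings. The variable names are illustrative only.

```python
ypogegrammeni = "\u0345"        # COMBINING GREEK YPOGEGRAMMENI

upper = ypogegrammeni.upper()   # maps to GREEK CAPITAL LETTER IOTA
back = upper.lower()            # maps to GREEK SMALL LETTER IOTA

# The combining mark does not survive the upper/lower round trip:
# it comes back as a full spacing letter.
print(f"U+{ord(ypogegrammeni):04X} -> U+{ord(upper):04X} -> U+{ord(back):04X}")
print(back == ypogegrammeni)
```

Once uppercased, nothing in the text records that the iota was originally a combining mark, so lowercasing cannot restore it.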
John Hudson
-- 
Tiro Typeworks        www.tiro.com
Vancouver, BC        john@tiro.ca

I am not yet so lost in lexicography, as to forget
that words are the daughters of earth, and that things
are the sons of heaven.  - Samuel Johnson