From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Thu Jul 31 2003 - 18:32:46 EDT
On 31/07/2003 15:02, Ted Hopp wrote:
>On Thursday, July 31, 2003 4:56 PM, John Cowan wrote:
>
>>Unicode allows any combining character to be attached to any base character
>>whatsoever. However, putting a dagesh into a DEVANAGARI KA, or placing a
>>circumflex over an ARABIC MEEM, is pretty certain to cause bad rendering, and
>>may screw up other text processes such as syllabication.
>
>From Unicode 3.2, Chapter 8 [regarding shin and sin dot]:
>"The two dots are mutually exclusive. The base letter shin can also have
>dagesh, a vowel, and other diacritics. Use of the two dots with any other
>base character is an error."
>
>Sometimes, doing something that's allowed can still be an error.
>
>
Presumably we have to distinguish between what is an error (of spelling, for
example) in a particular language and what is an illegal Unicode sequence.
The sentence quoted from Chapter 8 probably means something more like a
spelling error.

We mustn't forget that unusual combinations are sometimes meaningful. For
example, there are languages which use Hebrew base characters with Arabic
vowel points. We mustn't make these illegal sequences in Unicode without
very good reason.
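
To make the distinction concrete, here is a minimal sketch in Python of a
check that treats the Chapter 8 rule as advice rather than as a validity
test. The function name shin_dot_warnings and the message strings are my own
inventions for illustration, and treating any character with combining class
0 as a "base" is a simplifying assumption. Nothing is rejected; odd uses of
the two dots are only reported.

```python
import unicodedata

SHIN = "\u05E9"      # HEBREW LETTER SHIN
SHIN_DOT = "\u05C1"  # HEBREW POINT SHIN DOT
SIN_DOT = "\u05C2"   # HEBREW POINT SIN DOT
DOTS = {SHIN_DOT, SIN_DOT}


def shin_dot_warnings(text):
    """Return (index, message) pairs for unusual uses of the shin/sin dots.

    Hypothetical helper: the results are advisory only, and no input is
    treated as an illegal Unicode sequence.
    """
    warnings = []
    base = None        # most recent character with combining class 0
    dots_on_base = 0   # shin/sin dots seen since that base
    for i, ch in enumerate(text):
        if unicodedata.combining(ch) == 0:
            # Simplifying assumption: combining class 0 starts a new base.
            base, dots_on_base = ch, 0
            continue
        if ch in DOTS:
            dots_on_base += 1
            if base != SHIN:
                warnings.append((i, "dot on a base other than shin"))
            elif dots_on_base > 1:
                warnings.append((i, "more than one dot on the same shin"))
    return warnings
```

Shin followed by dagesh and shin dot ("\u05E9\u05BC\u05C1") produces no
warnings, while bet followed by shin dot ("\u05D1\u05C1") produces one. Both
strings remain perfectly encodable text, as does DEVANAGARI KA with a dagesh,
which this check does not look at.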
-- 
Peter Kirk
peter.r.kirk@ntlworld.com
http://web.onetel.net.uk/~peterkirk/