Re: Just if and where is the then?

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue May 04 2004 - 17:56:24 CDT


From: "African Oracle" <oracle@africaservice.com>
> "The existing composites were included only out of necessity so that new
> Unicode implementations could interoperate with existing implementations
> using legacy industry-standard encodings." - Peter Constable
>
> Are we saying we have exhausted such necessity?
>
> And what are these legacy-standard encodings?

I think these are the standards listed in the "References" section of the
Unicode Standard.
I don't think that this list is closed: further standards may be considered,
notably if they reach ISO standard status, or if they come into extensive use
in some popular OS as a de facto standard.

> "No new composite values will be added". - Peter Constable
> The above sounds dictatorial in nature.

I think that the sentence is incomplete, or that you are interpreting it the
wrong way. The key is the term "composite", which here should mean a character
that has a canonical decomposition into a base character with combining class
0, followed by one or more diacritics with a positive combining class.
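
To make that definition concrete, here is a minimal sketch in Python using the
standard unicodedata module. The helper name is_composite is my own invention
for illustration, and the test reflects my reading of "composite" above, not
any official Unicode property:

```python
import unicodedata

def is_composite(ch: str) -> bool:
    """Hypothetical helper: True if ch canonically decomposes into a base
    character (combining class 0) followed by one or more combining marks
    (combining class > 0)."""
    decomp = unicodedata.decomposition(ch)
    # An empty string means no decomposition; a leading "<tag>" marks a
    # compatibility decomposition, which is not a canonical one.
    if not decomp or decomp.startswith("<"):
        return False
    full = unicodedata.normalize("NFD", ch)  # full canonical decomposition
    if len(full) < 2:
        return False
    base, marks = full[0], full[1:]
    return unicodedata.combining(base) == 0 and all(
        unicodedata.combining(m) > 0 for m in marks
    )

for ch in ("\u00e9", "e", "\ufb01", "\u1ea5"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), is_composite(ch))
```

With this reading, U+00E9 (e with acute) and U+1EA5 (a with circumflex and
acute) count as composites, while plain "e" and the fi ligature U+FB01 (which
has only a compatibility decomposition) do not.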

However, this is a general principle that applies to already encoded scripts
that are already widely used (notably Latin, Greek, Cyrillic, Hiragana/Katakana
with voicing marks, Han with tone marks, pointed Hebrew or Arabic, and Brahmic
scripts). It may not apply to newly encoded scripts if they bring new combining
diacritics and new base letters, where some compositions may be desirable
immediately because the composite is difficult to render properly.
Some Semitic scripts, for example, have such complex rules for creating
composites from a base consonant and combining vowel modifiers that the whole
script was instead encoded as if it were a syllabary... (Here I am thinking of
Ethiopic, but some have different opinions and argue that Ethiopic is a true
syllabary, given its current modern usage).
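
The difference shows up directly in the character data. The following is only
a sketch with Python's unicodedata module (the helper name describe is
hypothetical): it compares a voiced Hiragana letter, which has a canonical
decomposition, with the Ethiopic "la" row, whose members are encoded as atomic
syllables:

```python
import unicodedata

def describe(ch: str):
    """Hypothetical helper: return (character name, canonical decomposition
    as a list of hexadecimal code points, empty if there is none)."""
    decomp = unicodedata.decomposition(ch)
    # Ignore compatibility decompositions, which start with a "<tag>".
    parts = [] if not decomp or decomp.startswith("<") else decomp.split()
    return unicodedata.name(ch), parts

# Hiragana GA decomposes into KA plus the combining voiced sound mark.
print(describe("\u304c"))

# Ethiopic LU has no decomposition into a consonant plus a vowel modifier.
print(describe("\u1209"))

# The seven basic orders of the "la" row, U+1208..U+120E, are unchanged by
# canonical decomposition (NFD).
la_row = [chr(cp) for cp in range(0x1208, 0x120F)]
atomic = all(unicodedata.normalize("NFD", ch) == ch for ch in la_row)
print(atomic)
```

Whatever one thinks about whether Ethiopic is a true syllabary, this is how it
looks to an implementation: each syllable is a single character with no
decomposition.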


