From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Aug 06 2003 - 10:26:42 EDT
On Wednesday, August 06, 2003 12:38 PM, Kent Karlsson <kentk@cs.chalmers.se> wrote:
> Since I think <a, ring above, cgj, dot below> should be canonically
> equivalent to <a, dot below, cgj, ring above>, but cannot be made
> so (now), the only ways out seem to be to either formally deprecate
> CGJ, or at least confine it to very specific uses. Other occurrences
> would not be ill-formed or illegal, but would then be non-conforming.
There's a way to specify that <A, RingAbove, CGJ, DotBelow> is
well-formed, but not <A, DotBelow, CGJ, RingAbove>:
a CGJ can be authorized in a combining sequence only if it
precedes a base character, or precedes a combining character
whose combining class is strictly lower than the combining class
of the character preceding the CGJ.
So, with this definition, with the combining classes indicated:
- <A=0, RingAbove=230, CGJ=0, DotBelow=220>
is well-formed because 220 < 230. It is distinct from:
<A=0, RingAbove=230, DotBelow=220>, whose canonical
ordering is
<A=0, DotBelow=220, RingAbove=230>
- <A=0, DotBelow=220, CGJ=0, RingAbove=230>
is ill-formed because 230 > 220. The CGJ is superfluous
and should be removed to create:
<A=0, DotBelow=220, RingAbove=230>
- <A=0, DotBelow=220, CGJ=0, MacronBelow=220>
is ill-formed because 220 = 220. The CGJ is superfluous
and should be removed to create:
<A=0, DotBelow=220, MacronBelow=220>
which is well-formed and in canonical order.
- <A=0, MacronBelow=220, CGJ=0, DotBelow=220>
is ill-formed because 220 = 220. The CGJ is superfluous
and should be removed to create:
<A=0, MacronBelow=220, DotBelow=220>
which is well-formed and in canonical order.
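The rule and the cases above can be sketched in Python. This is only an
illustration under stated assumptions, not an existing API: the names
misplaced_cgj and strip_misplaced_cgj are hypothetical, any character of
combining class 0 is treated as a "base character", and a CGJ at the very
end of the text is accepted, a case the rule above does not address. The
equal-class cases are exercised with U+0323 and U+0331 COMBINING MACRON
BELOW, both of which Python's unicodedata reports as class 220 (it reports
class 202 for U+0327 COMBINING CEDILLA).

```python
import unicodedata

CGJ = "\u034F"  # COMBINING GRAPHEME JOINER, combining class 0


def misplaced_cgj(text):
    """Return the indices of every CGJ that the proposed rule rejects."""
    bad = []
    for i, ch in enumerate(text):
        if ch != CGJ:
            continue
        # Class of the character just before the CGJ (0 if the CGJ comes first).
        before = unicodedata.combining(text[i - 1]) if i > 0 else 0
        # Class of the character just after the CGJ.
        # Assumption: end of text counts as class 0, so a trailing CGJ is accepted.
        after = unicodedata.combining(text[i + 1]) if i + 1 < len(text) else 0
        # Accepted only if the next character has class 0 (simplified "base")
        # or a class strictly lower than the one before the CGJ.
        if after != 0 and after >= before:
            bad.append(i)
    return bad


def strip_misplaced_cgj(text):
    """Drop the CGJs flagged as superfluous by misplaced_cgj."""
    bad = set(misplaced_cgj(text))
    return "".join(ch for i, ch in enumerate(text) if i not in bad)


RING_ABOVE = "\u030A"    # class 230
DOT_BELOW = "\u0323"     # class 220
MACRON_BELOW = "\u0331"  # class 220
```

For example, misplaced_cgj("A" + RING_ABOVE + CGJ + DOT_BELOW) returns an
empty list, while strip_misplaced_cgj("A" + DOT_BELOW + CGJ + RING_ABOVE)
returns "A" + DOT_BELOW + RING_ABOVE.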
This "well-formed" rule would clearly give exact semantics
to CGJ: used in the middle of a combining sequence, it is the
only way to bypass the canonical reordering of combining
characters.
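The reordering behaviour this relies on can be observed with Python's
standard unicodedata module. This is a small sketch, not part of any
proposal; the variable names are only illustrative. Because CGJ has
combining class 0, canonical reordering does not move marks across it.

```python
import unicodedata

CGJ = "\u034F"         # COMBINING GRAPHEME JOINER, combining class 0
RING_ABOVE = "\u030A"  # class 230
DOT_BELOW = "\u0323"   # class 220

plain = "A" + RING_ABOVE + DOT_BELOW
joined = "A" + RING_ABOVE + CGJ + DOT_BELOW

# Without CGJ, NFD sorts the marks by combining class: 220 before 230.
nfd_plain = unicodedata.normalize("NFD", plain)

# With CGJ in between, the two marks are in separate runs and stay in place.
nfd_joined = unicodedata.normalize("NFD", joined)

print([f"U+{ord(c):04X}" for c in nfd_plain])
print([f"U+{ord(c):04X}" for c in nfd_joined])
```

Here nfd_plain comes out as A, U+0323, U+030A, while nfd_joined is
unchanged from joined.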