Re: Character Sequences of Uncertain Rendering (was: Version linking?) from Richard Wordingham via Unicode on 2017-08-27 (Unicode Mail List Archive)

From: Richard Wordingham via Unicode <unicode_at_unicode.org>
Date: Mon, 28 Aug 2017 03:40:54 +0100

On Sun, 27 Aug 2017 19:55:31 +0200
Philippe Verdy via Unicode <unicode_at_unicode.org> wrote:

> 2017-08-27 6:06 GMT+02:00 Richard Wordingham via Unicode <
> unicode_at_unicode.org>:

> Canonical reordering is unambiguously referring to the canonical
> equivalences in TUS. These are automated and can occur at any time,
> and the only way to avoid them is to insert joiners. But they should
> never be needed for normal texts, except to split clusters or
> introduce semantic differences where they are relevant (and in that
> case the renderers will also try to distinguish them, otherwise they
> can freely reorder every sequence of diacritics with distinct
> non-zero combining classes and will represent all canonically
> equivalent sequences exactly the same way without distinguishing them).

This wasn't the sort of problem I was talking about. The Indic
example with undefined rendering has two left matras with ccc=0. The
question was whether they should be displayed from left to right (as in
MS Edge) or right to left (as in Firefox).
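
A minimal sketch of why normalization does not settle this, assuming Python's standard unicodedata module. The Bengali characters are chosen purely for illustration and are an assumption; they are not necessarily the ones from the example under discussion. Marks with ccc=0 are never canonically reordered, so both orders survive as distinct strings and the display order is left to the renderer:

```python
import unicodedata

# Illustrative characters only (an assumption, not taken from the thread).
KA = "\u0995"      # BENGALI LETTER KA
SIGN_I = "\u09BF"  # BENGALI VOWEL SIGN I (displayed to the left of the base)
SIGN_E = "\u09C7"  # BENGALI VOWEL SIGN E (displayed to the left of the base)

def is_normalization_stable(text):
    """Return True if neither NFC nor NFD changes the string."""
    return all(unicodedata.normalize(form, text) == text
               for form in ("NFC", "NFD"))

order_a = KA + SIGN_I + SIGN_E
order_b = KA + SIGN_E + SIGN_I

# Both vowel signs have canonical combining class 0 ...
ccc_values = [unicodedata.combining(ch) for ch in (SIGN_I, SIGN_E)]

# ... so normalization leaves each order untouched and never merges them.
both_stable = is_normalization_stable(order_a) and is_normalization_stable(order_b)
still_distinct = (unicodedata.normalize("NFC", order_a)
                  != unicodedata.normalize("NFC", order_b))
```

Nothing in this sketch says which visual order is correct; it only shows that the encoding layer keeps the two sequences apart.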

The problem of diacritics below having different combining classes has
been raised for minority languages in Thai. There seems a definite
prospect that the rendering order has to depend on the writing system -
and the other order would simply be wrong. Standardisation occurs
outside the purview of the UTC. The order may be forced by CGJ,
which is a joiner in name only when it occurs before combining marks.
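
A rough sketch of the CGJ point, again assuming Python's unicodedata module. The choice of SARA U and PHINTHU is my own illustration of two Thai below-marks with different combining classes, not a claim about any particular minority orthography:

```python
import unicodedata

KO_KAI = "\u0E01"   # THAI CHARACTER KO KAI (base letter)
SARA_U = "\u0E38"   # THAI CHARACTER SARA U, ccc=103
PHINTHU = "\u0E3A"  # THAI CHARACTER PHINTHU, ccc=9
CGJ = "\u034F"      # COMBINING GRAPHEME JOINER, ccc=0

def nfd(text):
    """Canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)

# Without CGJ, the mark with the lower combining class is moved first.
plain = KO_KAI + SARA_U + PHINTHU
reordered = nfd(plain)

# With CGJ between them, the ccc=0 character blocks the reordering.
forced = KO_KAI + SARA_U + CGJ + PHINTHU
preserved = nfd(forced)

classes = [unicodedata.combining(ch) for ch in (SARA_U, PHINTHU, CGJ)]
```

Here `reordered` comes out as KO KAI, PHINTHU, SARA U, while `preserved` is identical to `forced`. Whether a font then renders the forced order as intended is a separate matter.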

Richard.
Received on Sun Aug 27 2017 - 21:41:25 CDT
