Re: combining: half, double, triple et cetera ad infinitum

From: QSJN 4 UKR <>
Date: Mon, 6 Feb 2012 12:28:49 +0200

> 2011/11/14 Philippe Verdy <>:
>> And arguably, I have also wanted this for a long time, instead of the
>> hacks introduced by the so-called "double" diacritics and "half"
>> diacritics that break the character identity of those diacritics and
>> also introduce encoding ambiguities.
>> In fact, those things would have been encoded long ago if Unicode
>> and ISO 10646 had extended their character model to cover a broader
>> range of "structured character clusters".
>> Two format characters (with combining class 0 for the purpose of
>> normalizations) would have been enough for most applications:
>> And then you would have encoded the standard diacritics after the
>> sequence enclosed by these characters, for example cartouches (using
>> an enclosing diacritic).
>> A third format control would have been used as well to specify that
>> two clusters (simple letters or letters with simple diacritics, and
>> including extended clusters) would stack vertically instead of
>> horizontally. With this third one, the basic structure would be
>> encodable really as plain-text.
>> Yes, this would not have worked with today's OpenType specifications,
>> but this would have been the place for extending those specifications
>> and not something blocking the encoding process. I am still convinced
>> that this should not be part of an "upper-layer standard", which is
>> not interoperable and complicates the integration of those
>> pseudo-encoded texts.
>> Once the structure is encoded as such, there is still the possibility
>> of creating a linear graphical representation as a reasonably readable
>> fallback that exhibits the structure unambiguously, even if the text
>> renderer cannot produce the 2D layout (you just need to make those
>> format controls visible by themselves with a glyph, or by some other
>> means offered by the text renderer, including colors or various
>> style effects).

We don't need new special characters, new half-characters, or a new
ccc as I proposed above. No!
We already have the Annotation Characters!
It is possible to use something like U+FFF9 INTERLINEAR ANNOTATION
ANCHOR, РКГ, and U+FFFB INTERLINEAR ANNOTATION TERMINATOR, with the
titlo as the annotation, for the Cyrillic number 123 (РКГ under titlo).
This way titlos with supralinear letters (like SLOVO TITLO, TVERDO
TITLO) are also implementable.
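
As a rough sketch of the sequence I have in mind (Python, only to spell
out the code points): I am assuming here that the titlo is U+0483
COMBINING CYRILLIC TITLO and that it goes between U+FFFA INTERLINEAR
ANNOTATION SEPARATOR and the terminator. The helper name annotate is
made up for this illustration.

```python
import unicodedata

ANCHOR = "\uFFF9"      # INTERLINEAR ANNOTATION ANCHOR
SEPARATOR = "\uFFFA"   # INTERLINEAR ANNOTATION SEPARATOR
TERMINATOR = "\uFFFB"  # INTERLINEAR ANNOTATION TERMINATOR
TITLO = "\u0483"       # COMBINING CYRILLIC TITLO (assumed choice of titlo)


def annotate(base, *annotations):
    """Hypothetical helper: wrap a base text and its annotation chunks.

    Layout: ANCHOR base (SEPARATOR chunk)* TERMINATOR
    """
    body = "".join(SEPARATOR + chunk for chunk in annotations)
    return ANCHOR + base + body + TERMINATOR


# Cyrillic number 123: the letters РКГ with a titlo as the annotation.
number_123 = annotate("РКГ", TITLO)

# List the code points so the structure is visible in plain text.
for ch in number_123:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Whether a renderer draws the titlo over all three letters is exactly the
open point; the sketch only shows what would be stored in the text.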
The only question is the correct processing of annotation chunks that
start with a non-starter. I mean that a chunk of a multiline annotation
that begins with a combining character, without a base character,
should use the previous chunk as its base (in the best implementation).
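
A minimal sketch of that rule, again in Python and only under my own
assumptions: "non-starter" is taken to mean a canonical combining class
other than 0, one un-nested anchor/terminator run is processed at a
time, and the function names are invented for this example.

```python
import unicodedata

ANCHOR, SEPARATOR, TERMINATOR = "\uFFF9", "\uFFFA", "\uFFFB"


def split_annotation(text):
    """Split one ANCHOR...TERMINATOR run into chunks, base text first."""
    if not (text.startswith(ANCHOR) and text.endswith(TERMINATOR)):
        raise ValueError("expected a single annotated run")
    return text[1:-1].split(SEPARATOR)


def starts_with_nonstarter(chunk):
    """True if the first character has a non-zero combining class."""
    return bool(chunk) and unicodedata.combining(chunk[0]) != 0


def resolve_bases(chunks):
    """Pair each annotation chunk with the chunk it should combine with.

    A chunk that begins with a non-starter takes the previous chunk as
    its base; any other chunk is an ordinary annotation (base is None).
    """
    pairs = []
    for i in range(1, len(chunks)):
        chunk = chunks[i]
        base = chunks[i - 1] if starts_with_nonstarter(chunk) else None
        pairs.append((chunk, base))
    return pairs


chunks = split_annotation("\uFFF9РКГ\uFFFA\u0483\uFFFB")
pairs = resolve_bases(chunks)  # the titlo chunk is paired with "РКГ"
```

Note that enclosing marks have combining class 0, so this test would not
catch them; a check on the general category would be needed for those.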
Received on Mon Feb 06 2012 - 04:35:52 CST
