From: Philippe Verdy (firstname.lastname@example.org)
Date: Thu Nov 27 2003 - 08:00:43 EST
Arcane Jill writes:
> Gotcha. It's all starting to make sense now. Including the opposition to
> Maybe one could make "circled 92" in two stages:
> (1) create a glyph representing 92, then (2)
> apply an enclosing circle modifier to it.
> Except of course, that wouldn't work!
> Because a modifier only affects a single base character.
This is true unless the base character is linked to the preceding
characters by something like ZWJ, which creates a ligature opportunity (but
ZWJ offers no guarantee that the ligature or joining will actually be
applied on rendering, and it does not affect the semantics of the text, as
it is just a formatting control).
> Basically, you'd need to do: encircle( "9" + "2")
> instead of: "9" + encircle("2")
You're right here: simple concatenation with + is not intended to extend
the scope of the separate encircle() transformation function.
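
To make the difference concrete, here is a minimal Python sketch. It assumes
that encircle() simply appends U+20DD COMBINING ENCLOSING CIRCLE to its
argument; both encircle() and enclosed_bases() are hypothetical helpers used
only for illustration, not anything defined by Unicode.

```python
import unicodedata

ENCLOSING_CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE, category Me

def encircle(text):
    """Hypothetical helper: append the enclosing mark after the text."""
    return text + ENCLOSING_CIRCLE

def enclosed_bases(text):
    """List the base characters that carry an enclosing mark (category Me)."""
    bases = []
    last_base = None
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Me" and last_base is not None:
            bases.append(last_base)
        elif not category.startswith("M"):
            # Any non-mark character becomes the current base.
            last_base = ch
    return bases

# In plain text both spellings give the same code point sequence,
# and the mark attaches to the last base character only.
same = encircle("9" + "2") == "9" + encircle("2")
print(same, enclosed_bases(encircle("92")))
```

Under that assumption the two expressions are indistinguishable once encoded,
which is why the mark cannot cover both digits.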
If ZWJ actually created a "semantic" ligature, one could write:
encircle(<DIGIT NINE, DIGIT TWO>)
~~ encircle(<DIGIT NINE, ZWJ, DIGIT TWO>)
~~ <DIGIT NINE, ZWJ, DIGIT TWO, COMBINING ENCLOSING CIRCLE>
or more consistently (more complicated to implement in an encircle() function,
but probably simpler to parse and render correctly, by noting that the two
combining sequences on each side of ZWJ share a common "encircled"
rendering property, which could then be "factored out" when looking for the
range of characters to which the enclosing property should be applied):
== <DIGIT NINE, COMBINING ENCLOSING CIRCLE, ZWJ, DIGIT TWO,
COMBINING ENCLOSING CIRCLE>
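
The two ZWJ-based spellings above can be sketched in Python as follows. This
only illustrates the convention discussed here, which the standard does not
define; the function names encircle_joined(), encircle_each() and
factorized_range() are made up for the example.

```python
import unicodedata

ZWJ = "\u200D"               # ZERO WIDTH JOINER
ENCLOSING_CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE

def encircle_joined(text):
    """First form: characters joined by ZWJ, one mark at the end."""
    return ZWJ.join(text) + ENCLOSING_CIRCLE

def encircle_each(text):
    """Second form: every character carries the mark, ZWJ in between."""
    return ZWJ.join(ch + ENCLOSING_CIRCLE for ch in text)

def factorized_range(text):
    """Return the base characters if every ZWJ-separated sequence
    ends with the enclosing mark, otherwise None."""
    clusters = text.split(ZWJ)
    if all(len(c) >= 2 and c.endswith(ENCLOSING_CIRCLE) for c in clusters):
        return "".join(c[:-1] for c in clusters)
    return None

def names(text):
    """Unicode character names, for readable output."""
    return [unicodedata.name(ch) for ch in text]

print(names(encircle_joined("92")))
print(names(encircle_each("92")))
print(factorized_range(encircle_each("92")))
```

With the second form, a parser can recover the enclosed range by checking one
local property per sequence, whereas the first form gives no hint of the
circle until the final mark is reached.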
But I note that this is not the way the character model was defined.
In particular, there is the case of "double" diacritics, currently encoded as
<base letter 1, DOUBLE TILDE, base letter 2>
and not simply as:
<base letter 1, TILDE, ZWJ, base letter 2, TILDE>
as if it were the result of the function:
tilde(<base letter 1> + <base letter 2>)
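
For comparison, here are the two spellings in Python. The double tilde is
U+0360 COMBINING DOUBLE TILDE and the ordinary one is U+0303 COMBINING TILDE;
the helper names double_tilde() and tilde_with_zwj() are assumptions made for
this sketch.

```python
import unicodedata

DOUBLE_TILDE = "\u0360"  # COMBINING DOUBLE TILDE
TILDE = "\u0303"         # COMBINING TILDE
ZWJ = "\u200D"           # ZERO WIDTH JOINER

def double_tilde(first, second):
    """The encoding actually used: the double diacritic sits between
    the two base letters."""
    return first + DOUBLE_TILDE + second

def tilde_with_zwj(first, second):
    """The alternative discussed above, which is NOT how double
    diacritics are encoded."""
    return first + TILDE + ZWJ + second + TILDE

def names(text):
    """Unicode character names, for readable output."""
    return [unicodedata.name(ch) for ch in text]

print(names(double_tilde("n", "g")))
print(names(tilde_with_zwj("n", "g")))
```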
So for arbitrary encircled numbers, what would be needed is a "DOUBLE
ENCLOSING CIRCLE" diacritic (not currently encoded in Unicode, other than
through the PUA), like this:
encircle(<DIGIT 9, DIGIT 2>)
== <DIGIT 9, DOUBLE ENCLOSING CIRCLE, DIGIT 2>
Or for arbitrary numbers:
encircle(<DIGIT 9, DIGIT 2, DIGIT 3, DOT, DIGIT 0>)
== <DIGIT 9, DOUBLE ENCLOSING CIRCLE,
DIGIT 2, DOUBLE ENCLOSING CIRCLE,
DIGIT 3, DOUBLE ENCLOSING CIRCLE,
DOT, DOUBLE ENCLOSING CIRCLE,
DIGIT 0>
Here you don't have any ZWJ character: it is the double diacritic that
explicitly creates the ligature between the previous and next base
characters.
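
As a rough sketch of this convention in Python: since no DOUBLE ENCLOSING
CIRCLE exists in Unicode, the code below arbitrarily assumes the private-use
code point U+E000 for it. That assignment, and the encircle() and
enclosed_run() helpers, are purely hypothetical.

```python
# Hypothetical private-use assignment; Unicode encodes no such character.
DOUBLE_ENCLOSING_CIRCLE = "\uE000"
ENCLOSING_CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE (real)

def encircle(text):
    """Place the hypothetical double diacritic between each pair of
    adjacent characters; a single character uses U+20DD instead."""
    if len(text) == 1:
        return text + ENCLOSING_CIRCLE
    return DOUBLE_ENCLOSING_CIRCLE.join(text)

def enclosed_run(encoded):
    """Recover the characters of a run linked by the double diacritic."""
    return "".join(encoded.split(DOUBLE_ENCLOSING_CIRCLE))

encoded = encircle("923.0")
print([f"U+{ord(ch):04X}" for ch in encoded])
print(enclosed_run(encoded))
```

A run of n characters then needs n - 1 double diacritics and no ZWJ, but only
an application that knows this private convention can interpret it.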
None of these solutions is specified in the standard. They are pure
conventions for using Unicode, and until some enhancement to the Unicode
character model is published that clearly defines ranges of characters to
which diacritics can be applied, without relying on the overly simple ZWJ
control, the interpretation of text encoded this way will remain
application-dependent.
This archive was generated by hypermail 2.1.5 : Thu Nov 27 2003 - 09:55:51 EST