RE: Complex Combining

From: Philippe Verdy (
Date: Thu Nov 27 2003 - 16:41:13 EST

    Peter Kirk writes:
    > This is all rather interesting speculation.
    Yes, but it is not illegal to use these conventions, as it is still Unicode.
    It just happens that Unicode does not define precisely the semantics of such
    composed text, using ZWJ as an unspecified ligature opportunity, but not
    saying anything about how these ligatures are effectively rendered or

    > Now I am sure it could be argued that some of these are not plain text
    > and so should be dealt with by higher level markup. But maybe some of
    > these need to be considered as part of plain text; for example, it is at
    > least conceivable, and arguably true of the Egyptian cartouche, that
    > these marks are required for proper understanding of the plain text,
    > just as much so as regular letters and combining marks.

    I thought that, for the case of the Hieroglyphic cartouche, a pair of
    parenthesis-like characters were to be used as punctuation. It just happens
    that the opening cartouche and closing cartouche are ligated together with
    their content.

    The same goes for the thick line drawn above characters to denote colored
    text, where it denotes more than just emphasis: to me, these are best viewed
    as abstract parenthesis-like punctuation pairs...

    > So how should they be represented? Philippe's suggestion of <c1, mark,
    > c2, mark, c3, mark... mark, cn> would seem to work, but could be very
    > inefficient. Jill's alternative <bracket1, c1, c2, c3... cn, bracket2,
    > mark> is more efficient for long sequences. But perhaps better would be
    > to have paired opening and closing marks: <mark1, c1, c2, c3... cn,
    > mark2> - although this requires a new pair of characters for each
    > such case.

    Interesting proposal, if these two OPEN/CLOSE bracket characters are not
    just format control characters but are seen as invisible punctuation (similar
    to parentheses), with combining class 0. It would allow encoding any
    abstract containment model for diacritics and ligatures.
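
    The three layouts discussed above can be compared by counting code points.
    This is only a rough sketch: the private-use code points U+E000..U+E004
    below are placeholders I made up for illustration, since no such bracket or
    mark characters are actually encoded.

```python
# Hypothetical stand-ins (private-use code points, NOT real assignments).
OPEN, CLOSE = "\uE000", "\uE001"   # invisible open/close brackets
MARK = "\uE002"                    # a single joining/combining mark
MARK1, MARK2 = "\uE003", "\uE004"  # paired opening/closing marks


def interleaved(chars: str, mark: str = MARK) -> str:
    """<c1, mark, c2, mark, ..., mark, cn>: one mark between each pair."""
    return mark.join(chars)


def bracketed(chars: str, mark: str = MARK) -> str:
    """<bracket1, c1, ..., cn, bracket2, mark>: brackets, then one mark."""
    return OPEN + chars + CLOSE + mark


def paired(chars: str) -> str:
    """<mark1, c1, ..., cn, mark2>: a dedicated pair for each kind of mark."""
    return MARK1 + chars + MARK2


if __name__ == "__main__":
    for text in ("ab", "3.1415"):
        print(text, len(interleaved(text)), len(bracketed(text)), len(paired(text)))
```

    For n base characters the lengths are 2n-1, n+3 and n+2 code points, so
    the interleaved form only wins for very short runs (n below 3).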

    In that case, to create the circled number 3.1415, we would encode:
            <open invisible bracket>3.1415<close invisible bracket><combining enclosing circle>
    There would of course be no requirement for renderers to actually use a
    circle, as a rounded box would be easier to draw in a layout. This would
    also allow representing efficiently the Hieroglyphic cartouche.
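
    As a minimal sketch of how a process might build and recognize such a
    sequence: U+20DD COMBINING ENCLOSING CIRCLE is a real character, but the
    two invisible brackets are not encoded, so U+E000 and U+E001 are
    private-use placeholders assumed here purely for illustration. Nesting is
    not handled.

```python
import unicodedata

OPEN = "\uE000"              # hypothetical <open invisible bracket>
CLOSE = "\uE001"             # hypothetical <close invisible bracket>
ENCLOSING_CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE (real, category Me)


def enclose(text: str, mark: str = ENCLOSING_CIRCLE) -> str:
    """Encode <open>text<close><enclosing mark>."""
    return OPEN + text + CLOSE + mark


def parse_enclosed(s: str) -> list[tuple[str, str]]:
    """Return (content, mark) for each bracketed run followed by an
    enclosing mark (general category Me). Nested brackets are not handled."""
    groups = []
    i = 0
    while i < len(s):
        if s[i] == OPEN:
            end = s.find(CLOSE, i + 1)
            if end == -1:
                break  # unterminated bracket: stop scanning
            nxt = s[end + 1:end + 2]
            if nxt and unicodedata.category(nxt) == "Me":
                groups.append((s[i + 1:end], nxt))
                i = end + 2
                continue
        i += 1
    return groups


if __name__ == "__main__":
    encoded = enclose("3.1415")
    for content, mark in parse_enclosed(encoded):
        print(content, unicodedata.name(mark))
```

    A renderer that does not know the brackets would simply see the enclosing
    mark following an unknown character, so any real proposal would also have
    to say what the fallback display should be.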

    This archive was generated by hypermail 2.1.5 : Thu Nov 27 2003 - 17:23:25 EST