RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Dec 07 2003 - 15:10:16 EST

    Peter Kirk writes:
    > A very tentative suggestion for some glue: a character which can take
    > combining marks but whose function is to throw those marks back on to
    > the preceding base character, preceding any markup. This would have to
    > be a zero width base character, not a format character or a combining
    > mark to avoid defective sequences; it would also have to be default
    > ignorable. Or maybe there is an existing character which can be reused
    > e.g. WJ (same counterarguments as recently on the bidi list of course).
    > A very simple implementation could use the backspacing trick Philippe
    > used to position the diacritics roughly. A more adequate one would be a
    > difficult problem for font developers, but one that is soluble in
    > principle.

    The glue looks good in appearance, but it seems much too complex to
    implement in Unicode. I do think that specific occurrences of complex
    styles must be handled with a stylesheet, where a composite style is
    applied to any given grapheme cluster as a whole.

    So my best choice to represent something like an i that carries a
    distinct style would be something like:
            <span class="styled-i">i</span>
    and then leave all the complex styling of this "i" to be specified
    entirely by the "styled-i" class defined in the stylesheet.

    The stylesheet could then completely ignore what is really encoded
    inside the <span> element, and could work entirely with glyph
    specifications, which could for example take the dotless i glyph and
    a dot-above diacritic glyph to rebuild this character.

    However, if a document needs many occurrences of such a style, it would
    be useful to be able to define a style that applies to individual
    characters or sequences of characters wherever they occur in a classed
    text, so that you won't need to write:
            <span
            class="styled-i">i</span>n<span
            class="styled-i">i</span>t<span
            class="styled-i">i</span>at<span
            class="styled-i">i</span>ve
    but instead just:
            <span class="myStyle">initiative</span>
    And in the stylesheet, the selector would be something like:
            .myStyle#"i" {
                    compose-mode: replace;
                    compose-as: above(.dotless-i, .red-dot-above);
            }
            .dotless-i {
                    insert-text: "&dotless-i;";
            }
            .red-dot-above {
                    insert-text: "&dot-above;";
                    color: red;
            }

    Of course no such language exists in CSS for now, but it would
    allow a clear separation of text and style without limiting the
    kind of styling one would like to do (including rendering some text
    with images, objects, or active scripts).
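
    Until a selector of that kind exists, the long markup can at least be
    generated from the short form by a preprocessing step. The sketch below
    is only an illustration under stated assumptions: the function name
    wrap_styled_chars and its mapping argument are hypothetical, and it
    walks the text one code point at a time, so it does not keep a base
    letter together with any combining marks that follow it.

```python
from html import escape


def wrap_styled_chars(text: str, classes: dict[str, str]) -> str:
    """Wrap each character listed in `classes` in a <span> with its class.

    `classes` maps a single character to a CSS class name (hypothetical
    names such as "styled-i"). All other characters are HTML-escaped and
    copied through unchanged.
    """
    parts = []
    for ch in text:
        css_class = classes.get(ch)
        if css_class is None:
            parts.append(escape(ch))
        else:
            parts.append(
                f'<span class="{escape(css_class)}">{escape(ch)}</span>'
            )
    return "".join(parts)


# Expand the short form of the example word into per-character spans.
print(wrap_styled_chars("initiative", {"i": "styled-i"}))
```

    The stylesheet then only has to define the "styled-i" class, as in the
    single-character case.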

    But all this is out of scope for Unicode. Such things should be
    requested from the W3C as a CSS enhancement proposal, to allow
    styling only small fragments of the text found in text elements
    (with fragments initially specified as characters or strings of
    characters, or possibly as regular expressions matching the items
    that need complex styling)...
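
    To make the regular-expression variant concrete, here is a minimal
    sketch of how fragments could be selected outside CSS. It is an
    assumption-laden illustration, not a proposal for the actual syntax:
    wrap_matches is a hypothetical helper, and the pattern shown treats
    U+0300..U+036F as the set of combining marks, which covers only the
    basic Combining Diacritical Marks block.

```python
import re
from html import escape


def wrap_matches(text: str, pattern: str, css_class: str) -> str:
    """Wrap every non-empty match of `pattern` in a classed <span>.

    Text outside the matches is HTML-escaped and copied through, so the
    result can be placed directly inside an HTML text element.
    """
    parts = []
    pos = 0
    for m in re.finditer(pattern, text):
        if m.start() == m.end():
            continue  # ignore empty matches, they select nothing
        parts.append(escape(text[pos:m.start()]))
        parts.append(
            f'<span class="{escape(css_class)}">{escape(m.group(0))}</span>'
        )
        pos = m.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)


# An "i" together with any combining marks that follow it is selected
# as one fragment, so the base letter and its diacritics stay in one span.
I_WITH_MARKS = "i[\u0300-\u036f]*"
print(wrap_matches("initiative", I_WITH_MARKS, "styled-i"))
```

    Matching the base letter together with its marks keeps each selected
    fragment a whole unit, in line with applying a composite style to the
    cluster rather than to its parts.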

    This archive was generated by hypermail 2.1.5 : Sun Dec 07 2003 - 15:58:32 EST