Re: character "combinability"

From: spir (denis.spir@free.fr)
Date: Fri Feb 19 2010 - 04:04:50 CST


    On Thu, 18 Feb 2010 15:34:24 +0100
    Kent Karlsson <kent.karlsson14@comhem.se> wrote:

    >
    > Den 2010-02-18 15.15, skrev "spir" <denis.spir@free.fr>:
    >
    > > Hello,
    > >
    > > Does Unicode specify which characters, especially bases (*), are allowed for
    > > combination (into a combining sequence)? For instance, from the ASCII subset,
    > > it seems to me only letters can occur in a combination --except for the
    > > special case of CR-LF. But I could not find any such restriction list. There
    > > may be two cases, imo:
    >
    > CR-LF is not a combining sequence.

    I wrote too fast; read "grapheme cluster" instead. (I first thought a "combining sequence" was a representation of a "grapheme cluster", more or less as a "code" is a representation of a "character".)
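
    To make the distinction concrete, here is a minimal sketch, assuming Python 3 and its standard unicodedata module (the helper name is_combining_mark is only illustrative). It checks that neither CR nor LF is a combining mark, so CR-LF cannot be a combining sequence; keeping CR-LF together is a matter of grapheme cluster segmentation (UAX #29), which this sketch does not implement.

```python
import unicodedata

def is_combining_mark(ch):
    # General categories Mn, Mc and Me are the combining marks.
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")

crlf = "\r\n"

# Both CR and LF are control characters (Cc), not combining marks.
crlf_info = [(unicodedata.category(c), is_combining_mark(c)) for c in crlf]
print(crlf_info)

# For comparison: U+0302 COMBINING CIRCUMFLEX ACCENT is a combining mark.
print(is_combining_mark("\u0302"))
```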

    > But talking about combining sequences:
    >
    > > -1- Either Unicode does not impose any restriction on combination. But then we
    > > can, and are allowed to, concretely encode characters (or rather graphemes) that
    > > have no attested existence in real use: for instance, (ASTERISK, COMBINING
    > > CIRCUMFLEX ACCENT). This seems to me to contradict the Unicode guidelines, I guess.
    > > But it opens the door to creative use of Unicode ;-)
    >
    > You are free to combine away. Not all will render properly, but that is a
    > property of the font+rendering engine.

    Right, this is what I needed to know.
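
    As a rough illustration of "free to combine away", here is a small sketch, again assuming Python 3 and its standard unicodedata module (the names star_hat, e_hat and describe are mine, chosen for the example). It builds (ASTERISK, COMBINING CIRCUMFLEX ACCENT) and compares it with (e, COMBINING CIRCUMFLEX ACCENT) under NFC normalization: the first stays a two-code-point sequence because no precomposed character exists, the second composes to a single code point. Whether either one renders well is still up to the font and rendering engine.

```python
import unicodedata

# ASTERISK followed by COMBINING CIRCUMFLEX ACCENT: a legal combining
# sequence, even though no precomposed character exists for it.
star_hat = "\u002A\u0302"
# LATIN SMALL LETTER E followed by the same mark, for comparison.
e_hat = "\u0065\u0302"

def describe(s):
    """Return (character name, canonical combining class) per code point."""
    return [(unicodedata.name(c), unicodedata.combining(c)) for c in s]

star_nfc = unicodedata.normalize("NFC", star_hat)
e_nfc = unicodedata.normalize("NFC", e_hat)

print(describe(star_hat))
print(len(star_nfc))                        # still two code points
print(len(e_nfc), unicodedata.name(e_nfc))  # composed into one code point
```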

    > > Or is there a kind of implicit gentleman's agreement; meaning combinations
    > > should be used in a sensible manner?
    >
    > You could say that.

    All right, thank you.

    > /kent k

    Denis

    ________________________________

    la vita e estrany

    http://spir.wikidot.com/


