RE: Saudi-Arabian Copyright sign

From: Asmus Freytag
Date: Mon Sep 20 2004 - 13:21:19 CDT


    At 06:09 PM 9/19/2004, D. Starner wrote:
    >Asmus Freytag writes:
    > > Given
    > > the nature of the symbol in question, I would personally see no reason to
    > > object
    > > to encoding it - especially given the current and projected lack of
    > > availability
    > > of other alternatives.
    >It's a simple combining character. Even if you can't do arbitrary circles
    >around characters, you can take one character sequence and map it to the
    >glyph in a font. Systems that can't do even that need to be fixed.

    In other words, you would like to treat this as a mandatory ligature.
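
A minimal sketch of what "map one character sequence to one glyph" means in practice. The glyph names, the table, and the `shape` helper are all hypothetical; real fonts do this through their own substitution tables, not through code like this.

```python
# Hypothetical ligature table: character sequence -> glyph name.
# The single entry is illustrative and not taken from any real font.
LIGATURES = {
    "C\u20DD": "circled_C",  # LATIN CAPITAL LETTER C + COMBINING ENCLOSING CIRCLE
}


def shape(text, ligatures=LIGATURES):
    """Map text to glyph names, preferring the longest matching sequence."""
    glyphs = []
    max_len = max(map(len, ligatures), default=0)
    i = 0
    while i < len(text):
        # Try the longest candidate first so a full sequence wins over its prefix.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in ligatures:
                glyphs.append(ligatures[chunk])
                i += n
                break
        else:
            # No ligature starts here: fall back to a per-character glyph name.
            glyphs.append(f"uni{ord(text[i]):04X}")
            i += 1
    return glyphs
```

A font without the table entry falls through to the per-character branch, which is the interchange risk discussed in this message: the sequence only renders as one symbol where the mapping is actually present.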

    To make this work in interchange, we need to get the buy-in from enough
    platform, application and font vendors that they want to support this and
    similar characters in that way (and fix their products where necessary).

    If we can get that kind of buy-in, then we could add this and other special
    purpose circled characters via the new "named sequences".

    Lacking such buy-in, the addition of these as characters becomes more
    attractive.

    The problem here is that we have a proven track record that implementers
    *have* supported additions to the character repertoire by expanding their
    fonts. We do *not* have a proven track record of implementers widely
    supporting special layout features, other than the core requirements for
    a given script.

    However, since we don't want to continue to encode accented characters
    because of normalization, we are adding named sequences to the standard,
    so that users can identify required sequences of characters and accents
    by referring to sequence identifiers. Your suggestion logically implies
    the extension of that process.
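
As a rough illustration of referring to a sequence by an identifier, the sketch below parses a small table of `name;code points` lines. The layout is an assumption (the draft data file may differ), and both sample entries, in particular the "HYPOTHETICAL" one, are invented for illustration.

```python
# Illustrative data only, in an assumed 'name;code points' layout.
SAMPLE = """\
# name;space-separated hexadecimal code points
LATIN SMALL LETTER A WITH MACRON AND GRAVE;0101 0300
HYPOTHETICAL CIRCLED LATIN CAPITAL LETTER C;0043 20DD
"""


def parse_named_sequences(text):
    """Return a dict mapping each sequence name to its string of characters."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, code_points = line.split(";", 1)
        table[name.strip()] = "".join(
            chr(int(cp, 16)) for cp in code_points.split()
        )
    return table


NAMED_SEQUENCES = parse_named_sequences(SAMPLE)
```

The point of such a table is only that a sequence gets a stable name; rendering it as a single symbol still depends on fonts and layout engines.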

    In the case of symbols like copyright, we do have a precedent of encoding
    these outright and *not* normalizing them. (C) and circled C are not
    identical as Unicode stands today. This is similar to currency symbols.
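
The lack of normalization can be checked with Python's standard `unicodedata` module. This sketch assumes "(C)" means U+00A9 COPYRIGHT SIGN and that "circled C" means either U+24B8 or the sequence C + U+20DD; that reading is an assumption, not something the message spells out.

```python
import unicodedata

COPYRIGHT = "\u00A9"      # COPYRIGHT SIGN
CIRCLED_C = "\u24B8"      # CIRCLED LATIN CAPITAL LETTER C
C_PLUS_CIRCLE = "C\u20DD"  # C + COMBINING ENCLOSING CIRCLE


def forms(s):
    """Return the four standard normalization forms of s."""
    return {
        form: unicodedata.normalize(form, s)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


# The three spellings stay distinct under canonical normalization.
distinct_under_nfc = len(
    {unicodedata.normalize("NFC", s) for s in (COPYRIGHT, CIRCLED_C, C_PLUS_CIRCLE)}
) == 3
```

Only the compatibility forms change anything here: they fold U+24B8 to a plain "C", while the copyright sign is left untouched by all four forms.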

    However, unlike currency symbols, which are in extremely common use on
    all sorts of embedded platforms, and where therefore single character
    codes can be an advantage, most of the symbols like (Wz) etc. are much
    less widely used and could indeed be handled as above (and recognized
    as named sequences).

    From the perspective of the users, this solution would be more appealing
    if we had the buy-in from major vendors to actively support this approach.


    PS for named sequences:
    Draft Data:
    (the last part of the file name may change to NamedSequences*.txt).

