Re: Misuse of encoded characters (was: "Re: Creative commons' license symbols")

From: Kenneth Whistler
Date: Mon Nov 27 2006 - 14:47:38 CST

    Antonia Tuvalkin said:

    > On 2006.11.22, 22:31, Kenneth Whistler wrote:
    > > If I chose to take the Creative Commons Noncommercial symbol (the
    > > backslashed circle over a dollar sign), which they use very explicitly
    > > in particular licenses, and hijacked it to start advertising links to a
    > > tax protest site, or to a bunch of anarchists advocating the bombing of
    > > banks, I'd probably be hearing shortly from a CC lawyer wanting to deal
    > > with misappropriation of their IP rights
    > <...>
    > > Symbols for encoding as characters in Unicode cannot be encumbered with
    > > some particular group's claim to control their exact shape, appearance,
    > > meaning, function, and usage rights.
    > How does this hinder the chances for encoding?

    I think Doug Ewell said most of what needs saying on that topic in
    the original thread.

    > After all, the said
    > misusers instead of waiting for the said symbol to be added to Unicode
    > could just start using today U+0024 U+20E0 instead. Then what? How would
    > this be different than a new, single character?

    Enclosing combining marks have a problematical status in the
    standard. This has been a difficult-enough issue that the Unicode 5.0
    text added more cautions specific to marks like the combining
    enclosing circle:

      "Users should be cautious when applying combining enclosing marks
       to other than freestanding symbols--for example, when using a
       combining enclosing circle to apply to a letter or a digit. Most
       implementations assume that application of any nonspacing mark will
       not change the character properties of a base character. This
       means that even though the intent might be to create a circled
       symbol (General_Category=So), most software will continue to
       treat the base character as an alphabetic letter or a numeric
       digit. Note that there is no *canonical* equivalence
       between a symbolic character such as U+24B6 CIRCLED LATIN
       CAPITAL LETTER A and the sequence <U+0041 LATIN CAPITAL LETTER A,
       U+20DD COMBINING ENCLOSING CIRCLE>, partly because of this
       difference in treatment of properties." -- TUS 5.0, p. 258

    Applying U+20E0 graphically to U+0024 DOLLAR SIGN would have this
    issue, because the dollar sign would still be seen by software
    as a currency sign, in most cases, rather than the resulting
    displayable graphic form being treated as a unitary symbol.
    Note also that a number of circled characters have been added to
    the standard, despite the availability of U+20DD.
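
    As a rough illustration of the property issue, here is a minimal
    sketch assuming Python 3 and its standard unicodedata module. The
    constant and function names are hypothetical, chosen only for this
    example. It checks that the base character keeps its own
    General_Category when an enclosing mark follows it, and that the
    sequence <U+0041, U+20DD> is not canonically equivalent to U+24B6:

```python
import unicodedata

# Hypothetical names, used only for this illustration.
DOLLAR_CIRCLE_BACKSLASH = "\u0024\u20E0"  # DOLLAR SIGN + COMBINING ENCLOSING CIRCLE BACKSLASH
LETTER_A_CIRCLE = "\u0041\u20DD"          # LATIN CAPITAL LETTER A + COMBINING ENCLOSING CIRCLE
CIRCLED_A = "\u24B6"                      # CIRCLED LATIN CAPITAL LETTER A


def categories(text):
    """Return the General_Category of each code point in text."""
    return [unicodedata.category(ch) for ch in text]


def canonically_equivalent(a, b):
    """Compare two strings by their canonical decompositions (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    # The dollar sign stays a currency symbol (Sc); the mark is an
    # enclosing mark (Me). Nothing turns the pair into a single So symbol.
    print(categories(DOLLAR_CIRCLE_BACKSLASH))   # ['Sc', 'Me']

    # The precomposed circled letter is a symbol; the sequence is a letter
    # followed by a mark.
    print(categories(CIRCLED_A))                 # ['So']
    print(categories(LETTER_A_CIRCLE))           # ['Lu', 'Me']

    # No canonical equivalence between the two representations.
    print(canonically_equivalent(CIRCLED_A, LETTER_A_CIRCLE))  # False
```

    U+24B6 does carry a compatibility decomposition to U+0041, so
    unicodedata.normalize("NFKD", CIRCLED_A) gives a plain "A", but
    that is a separate matter from the canonical equivalence the
    quoted text refers to.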

    The ability to represent the Creative Commons Noncommercial symbol
    now with <U+0024, U+20E0> is thus neither here nor there as
    regards whether a proposal to encode it as a unitary symbol
    (and with General_Category=So) in the standard would succeed,
    any more than the availability of <U+0030, U+20DD> prevented the
    encoding of U+24EA CIRCLED DIGIT ZERO.
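
    The digit-zero case can be checked the same way. This is again
    only a sketch, assuming Python 3's unicodedata module; the names
    CIRCLED_ZERO, ZERO_PLUS_CIRCLE, and describe are hypothetical. The
    encoded character and the combining sequence remain distinct
    under canonical normalization and have different properties:

```python
import unicodedata

# Hypothetical names, used only for this illustration.
CIRCLED_ZERO = "\u24EA"            # CIRCLED DIGIT ZERO
ZERO_PLUS_CIRCLE = "\u0030\u20DD"  # DIGIT ZERO + COMBINING ENCLOSING CIRCLE


def describe(text):
    """List (code point, General_Category) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in text]


if __name__ == "__main__":
    # The unitary character is "other number" (No); the sequence is a
    # decimal digit (Nd) followed by an enclosing mark (Me).
    print(describe(CIRCLED_ZERO))
    print(describe(ZERO_PLUS_CIRCLE))

    # Canonical composition does not merge the sequence into U+24EA.
    composed = unicodedata.normalize("NFC", ZERO_PLUS_CIRCLE)
    print(composed == CIRCLED_ZERO)

    # Software that looks only at the base character still sees a
    # decimal digit in the sequence, but not in the circled character.
    print(ZERO_PLUS_CIRCLE[0].isdecimal(), CIRCLED_ZERO.isdecimal())
```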

    > Hey, if someone misuses U+FDF2 I bet s/he'll have to deal with very angry
    > people. Not to mention U+271D and many such symbols. But how does that
    > affect the standard?

    Of course. Any religious symbol, particularly, is easily subject to
    any number of misuses and abuses that would rile significant
    numbers of folks. For that matter, you could just spell out
    "God" in ASCII characters and find a way to put it into a context
    that would make people furious at you.

    But that is utterly beside the point, frankly.

    There is no organization claiming it created the symbol U+271D for
    a specific purpose, with a website demonstrating the exact
    form and context they expect it to be used in, and claiming to
    license its usage, under legal restrictions, for certain

    It is symbols that come in from sources like the latter that
    come pre-encumbered with expectations and limitations that
    make the UTC (and WG2 for that matter) leery of them and desirous of
    demonstration of widespread general usage before encoding
    them as characters.

