Re: Awful Unicode character names (was Re: I-Ching Hexagrams)

From: Kenneth Whistler
Date: Fri Apr 11 2003 - 17:30:10 EDT

    Pim Blokland wrote:

    > Another candidate for the Awful Character names would be U+0192.
    > Its formal designation in the database is LATIN SMALL LETTER F WITH
    > HOOK.
    > This is OK. However, this poor little overworked character has two
    > extra jobs:
    > 1) It doubles as the guilder (florin) sign. In fact, this use is
    > more widespread than the f with hook.
    > Were we to accept this use, it wouldn't just be the name that would
    > have to be changed; the general category and bidirectional category
    > would change as well. And of course it wouldn't have an uppercase
    > equivalent mapping.
    > Even the appearance would change: while the f with hook looks like,
    > well, a roman f with a left hook, the guilder looks like an italic
    > f. It's not the same character at all!
    > Suggested info for the UnicodeData database:
    > 0192;GUILDER SIGN;Sc;0;ET;;;;;N;;;;;

    As Michael points out, this has long been known, and people
    have been living with the ambiguity for years. But if an
    unambiguous florin/guilder currency sign is desired, it
    is a matter of developing the proposal summary form and
    championing it to convince the committees to encode another
    currency sign.

    Note, however, that the characters in legacy character sets
    to which U+0192 maps have themselves been used ambiguously.
    So this is similar, in some regards, to such famously overloaded
    ASCII characters as U+007E TILDE or U+0027 APOSTROPHE.
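
As an illustration of the legacy mappings mentioned above, here is a minimal Python sketch that decodes one byte from each of three legacy code pages and shows that all of them land on U+0192. The choice of code pages and the names `LEGACY_F_HOOK` and `decode_legacy` are assumptions made for this example; they are not from the original message.

```python
# Byte values below are taken from the published code-page tables;
# the selection of code pages is illustrative, not exhaustive.
LEGACY_F_HOOK = {
    "cp1252": b"\x83",     # Windows Latin 1
    "mac_roman": b"\xc4",  # Mac OS Roman
    "cp437": b"\x9f",      # original IBM PC code page
}


def decode_legacy(table):
    """Decode each single byte with its own codec."""
    return {codec: raw.decode(codec) for codec, raw in table.items()}


decoded = decode_legacy(LEGACY_F_HOOK)
for codec, ch in decoded.items():
    print(f"{codec}: U+{ord(ch):04X}")
```

Whether a given byte was meant as a florin sign, a function symbol, or a letter cannot be recovered from the byte itself, which is the ambiguity being described.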

    > 2) The SGML definition calls it "function of". HTML entity name:
    > &fnof;. It now suddenly is a mathematical character.
    > In this case too, not just the name should be changed, lots of other
    > categories as well. And the appearance: to differentiate from a
    > normal f, function of is often written as a script-f. (The irony is

    Nope. This one is off the table:

    1D453;MATHEMATICAL ITALIC SMALL F;Ll;0;L;<font> 0066;;;;N;;;;;
    1D4BB;MATHEMATICAL SCRIPT SMALL F;Ll;0;L;<font> 0066;;;;N;;;;;

    And it is the mathematical *italic* form which is appropriate
    for functions (if distinguished from the normal f, U+0066),
    rather than the mathematical *script* form.
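
For readers who want to check the two records quoted above, a small sketch using Python's standard `unicodedata` module follows. It assumes a Python build whose Unicode database includes the mathematical alphanumeric block (any current release does); the helper name `describe` is made up for this example.

```python
import unicodedata


def describe(ch):
    """Return (name, decomposition field, NFKC form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKC", ch),
    )


italic_f = describe("\U0001D453")
script_f = describe("\U0001D4BB")

for code, info in (("1D453", italic_f), ("1D4BB", script_f)):
    print(code, *info, sep=" | ")
```

Both characters carry a `<font>` compatibility decomposition to U+0066, so compatibility normalization folds them back to a plain f. U+0192 has no such decomposition.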

    > that this is exactly how it was called in Unicode 1.0.)
    > Suggested info in the UnicodeData database:
    > 0192;FUNCTION OF;Sm;0;ON;;;;;Y;;;;;
                       ^^   ^^     ^
    See above. I would disagree about all of these property assignments.
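
For comparison with the suggested values, the properties actually assigned to U+0192 can be read back with Python's `unicodedata` module. This is only a sketch: the helper `core_fields` and its dictionary keys are invented for the example, and it covers just the fields under discussion rather than the whole UnicodeData record.

```python
import unicodedata


def core_fields(ch):
    """Collect the properties discussed in this thread for one character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "bidi": unicodedata.bidirectional(ch),
        "mirrored": "Y" if unicodedata.mirrored(ch) else "N",
        "upper": ch.upper(),  # simple uppercase mapping, if any
    }


f_hook = core_fields("\u0192")
for key, value in f_hook.items():
    print(f"{key}: {value}")
```

The character comes back as a lowercase letter (Ll) with strong left-to-right directionality (L), not mirrored, and with an uppercase mapping to U+0191, in contrast to the Sm / ON / Y values in the suggestion.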

    > You know what? It won't work. Can't cram three different characters
    > into the same codepoint. Forget it...

    It's been done before. ;-)

