RE: Hexadecimal digits?

From: Kenneth Whistler (
Date: Mon Nov 10 2003 - 16:22:54 EST

    Jill Ramonsky asked:

    > My question went unanswered, so I'll ask it again - do I get a vote?

    Well, Philippe did try to answer the question, but if you need it again,
    then the ultimate answer is no.

    By the way, in case people have not noticed, this is a WG2 document,
    not a UTC document, submitted by Ricardo Cancho Niemietz. Decisions
    taken by WG2 are made on the basis of the JTC1 voting procedures,
    and in this case the voting parties are the national body standards
    organizations participating in ISO/IEC JTC1/SC2. In other words, this
    would be ANSI (U.S.), DIN (Germany), JSA (Japan), AFNOR (France),
    and so on.

    Of course, any proposal for adding characters to 10646 ultimately
    is also a proposal to add them to the Unicode Standard. And when
    such a proposal would come around to the Unicode Technical Committee
    for consideration, it would be the full members of the Unicode
    Consortium ultimately that would be voting on it in that context.

    > How does one go about registering support for a proposal?

    Expert contributions are accepted both by WG2 and by the UTC, and
    such contributions may consist of technical argumentation in favor
    of or in opposition to some proposal to encode characters, as in this

    > I consider
    > myself a relevant interested party, someone who believes that hex FFFF
    > should collate before hex 10000 in a "natural sort". Is it possible to
    > add support to a proposal, or do I just have to sit here and wait while
    > the Consortium reject it

    As Mark Davis indicated, it is highly likely that the Unicode
    Technical Committee would reject the proposal anyway, if it ever
    got that far. These kinds of proposals for hexadecimal digits
    have been floating around for years. As far as I can tell, they
    have absolutely no traction among the members of the UTC.

    > Is there anything I can do to encourage the
    > adoption of this proposal. (Or in general, is there anything that any of
    > us can do to encourage the adoption of a proposal which we support?)

    If you are feeling disenfranchised, you can always participate
    by the means of (written) expert contributions to the relevant
    standards organizations.

    > Jill

    > I really hope you're not planning on referencing the entire earlier
    > discussion from the archives one email at a time. That would be SO
    > tedious. I was merely pointing out that we've been over all this before.

    Note that I'm not going to bother to restate my own prior arguments
    against this. As Jill indicates, it's all in the archives if people
    want to dig it out again.

    However, Jill also stated:

    > The arguments made in Ricardo's document are clear, concise, and
    > absolutely spot on. He puts things better than I ever did, so I will let
    > his document stand as my argument.

    I concur that Ricardo stated his case more completely and clearly
    than anybody I have seen before on this topic. And he made a
    sterling effort to play the game by the rules, addressing
    specifically the disunification questions posed by the WG2
    proposal summary form and principles and procedures. He clearly
    also made a serious effort to read and understand the relevant
    portions of the Unicode Standard.

    But, ...

    Ricardo's arguments are for adding presentation forms of 'A'..'F'
    for the purpose of guaranteeing fixed-width presentation of
    hexadecimal digit strings in publishing and other contexts,
    using monofonts:

    "But the proposed characters are not intended to be used while
    writting [sic] program code (as source code is ever plain text),
    but to be used in all kinds of documentation, on paper or on screen
    output, where hexadecimal values often occurr [sic]--just like
    Unicode books. Then, the proposed hexadecimal digits ten to fifteen
    acts as a kind of specialized presentation forms of the latin
    capital letters A to F for purposes of desktop publishing
    of rich text."

    As others have pointed out, however, not all fonts use fixed
    pitch for digits, so the argument fails on its face. *Decimal*
    digit strings do not always line up. So while the problem of
    formatting hexadecimal strings with proportional fonts is
    clearly more obvious than with decimal strings, ultimately it is
    the same issue. And the solution is also a traditional one --
    simply pick a monospace font for display.

    In other words, the presentation forms that Ricardo wants are
    already present -- they are the glyphs for 'A'..'F' in monospace
    fonts. And since Ricardo is asking for this to address
    presentation in *rich* text formats, rather than in plain
    text, the existing mechanisms for rich text, including fonted
    text, can already address the problem for "desktop publishing of
    rich text."

    Some of Ricardo's examples are, implicitly, of plain text,
    such as file name listings. But in such contexts you run afoul
    of the fundamental problem addressed by Jim Allan:

    "Using symbols that the computer automatically distinguishes while human
    beings do not is a *dangerous* solution to any problem."

    If, after decades of practice to the contrary, we tried to
    implement a *coding* distinction between 'A'..'F' used as
    letters and used as hexadecimal digits, we would just introduce
    all kinds of mischief into processing of plain text. Use of
    the new hexadecimal digits would be illegal in the syntax
    of almost any formal language, and the only way to address the
    problem of the coexistence of previous data and data making
    use of the new digits would be to introduce normalization to
    level the distinctions -- which would vitiate the reasons for
    having made the coding distinction in the first place.
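
    The circularity described above can be sketched in a few lines. This
    is only an illustration under loud assumptions: no such characters
    exist, so the private-use code points U+E00A..U+E00F stand in for
    the proposed digits ten to fifteen, and the names fold_hex_digits
    and parse_hex are hypothetical helpers, not part of any standard
    or library.

```python
# Hypothetical stand-ins: private-use code points U+E00A..U+E00F play the
# role of the proposed "hexadecimal digits ten to fifteen". They are NOT
# real assignments; they only let the sketch run.
HYPOTHETICAL_HEX = {chr(0xE00A + i): "ABCDEF"[i] for i in range(6)}
FOLD = str.maketrans(HYPOTHETICAL_HEX)


def fold_hex_digits(text: str) -> str:
    """Level the distinction: map each stand-in digit back to 'A'..'F'."""
    return text.translate(FOLD)


def parse_hex(text: str) -> int:
    """Parse a hex string the way existing software does (ASCII letters)."""
    return int(text, 16)


legacy = "FF"               # existing data, written with the letters
new_style = "\ue00f\ue00f"  # same value, written with the stand-in digits

# An existing parser knows nothing about the new code points.
try:
    parse_hex(new_style)
    accepted = True
except ValueError:
    accepted = False

# Old and new data do not match until the distinction is folded away.
same_before = legacy == new_style
same_after = legacy == fold_hex_digits(new_style)
value_after = parse_hex(fold_hex_digits(new_style))

print(accepted, same_before, same_after, value_after)
```

    Once the folding step is applied everywhere that old and new data
    meet, the separately coded digits carry no information that the
    letters did not already carry.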

    I really don't see any way around the kind of problem that
    any proposal for hexadecimal digits would create for a
    character encoding, and that is the shared opinion in the
    UTC which blocks such proposals from getting off the ground.

