Re: Hexadecimal

From: Kenneth Whistler (
Date: Fri Aug 15 2003 - 15:10:43 EDT

    Jill Ramonsky asked:

    > Thoughts anyone?

    Well, yes...

    > If the semantic difference between (for example) uppercase D and
    > mathematical bold uppercase D was considered sufficiently great so as to
    > require a new codepoint, then I am tempted to wonder if the same might be
    > considered true of hexadecimal digits.


    > So far as I can see, every
    > single character in the "3AD29" string should be in general category N*
    > (either Nd or Nl).

    No. Doing so would trash other processing. And it would force the
    disunification you are suggesting, which would not actually
    have the effect of helping anyone process these hexadecimal strings,
    but would instead break all existing implementations of them.

    > Sure, you can tell them apart by context, in most circumstances, in the same
    > way that you can tell the difference between a hyphen and a minus sign by
    > context, but since the meanings are so clearly distinct, I wonder if there
    > is a case for distinguishing hex digits from letters without requiring
    > context.

    Sure there is a case for it, but not for breaking the existing
    encoding to do so.

    > I notice that there are Unicode properties "Hex_Digit" and "ASCII_Hex_Digit"
    > which some Unicode characters possess. I may have missed it, but what I
    > don't see in the charts is a mapping from characters having these properties
    > to the digit value that they represent.

    There isn't. Any more than there is a chart showing all the numeric
    values that Greek letters have, or all the numeric values that Hebrew
    letters have, or all the numeric values that Runic letters have, ...
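    Since no such chart is published, an implementation has to build the
    mapping itself. The following is only a sketch of one way to do that
    in Python. The ranges are the ones that carry the Hex_Digit property
    (the ASCII hex digits plus their fullwidth counterparts); the names
    HEX_DIGIT_RANGES and HEX_DIGIT_VALUE are made up for illustration
    and are not part of any Unicode data file or library.

```python
# Hypothetical table: (first code point, last code point, value of first).
# The ranges are those listed for Hex_Digit; the values are assigned here.
HEX_DIGIT_RANGES = [
    (0x0030, 0x0039, 0),   # DIGIT ZERO..DIGIT NINE
    (0x0041, 0x0046, 10),  # LATIN CAPITAL LETTER A..F
    (0x0061, 0x0066, 10),  # LATIN SMALL LETTER A..F
    (0xFF10, 0xFF19, 0),   # FULLWIDTH DIGIT ZERO..NINE
    (0xFF21, 0xFF26, 10),  # FULLWIDTH LATIN CAPITAL LETTER A..F
    (0xFF41, 0xFF46, 10),  # FULLWIDTH LATIN SMALL LETTER A..F
]

# Expand the ranges into a character -> digit value dictionary.
HEX_DIGIT_VALUE = {
    chr(cp): base + (cp - first)
    for first, last, base in HEX_DIGIT_RANGES
    for cp in range(first, last + 1)
}

print(HEX_DIGIT_VALUE["D"])        # ASCII capital D
print(HEX_DIGIT_VALUE["\uFF24"])   # fullwidth capital D, same value
```

    A character that is absent from the dictionary simply has no
    hexadecimal value under this sketch; its general category is left
    untouched, which is the point made above.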

    > Is it assumed that the number of
    > characters having the "Hex_Digit" properties is so small that implementation
    > is trivial?


    > That everyone knows it?


    > Or have I just missed the mapping by
    > looking in the wrong place?


    Basically, thousands of implementations, for decades now,
    have been using ASCII 0x30..0x39, 0x41..0x46, 0x61..0x66 to
    implement hexadecimal numbers. That is also specified in
    more than a few programming language standards and other
    standards. Those characters map to Unicode U+0030..U+0039,
    U+0041..U+0046, U+0061..U+0066.
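
    As an illustration of how little is involved, here is a minimal
    sketch in Python of the conventional approach, using only the three
    ASCII ranges named above. The function names hex_digit_value and
    parse_hex are assumptions chosen for this example, not taken from
    any standard.

```python
def hex_digit_value(ch: str) -> int:
    """Return 0..15 for a single ASCII hex digit, else raise ValueError."""
    cp = ord(ch)
    if 0x30 <= cp <= 0x39:        # U+0030..U+0039, '0'..'9'
        return cp - 0x30
    if 0x41 <= cp <= 0x46:        # U+0041..U+0046, 'A'..'F'
        return cp - 0x41 + 10
    if 0x61 <= cp <= 0x66:        # U+0061..U+0066, 'a'..'f'
        return cp - 0x61 + 10
    raise ValueError(f"not an ASCII hex digit: {ch!r}")


def parse_hex(s: str) -> int:
    """Interpret a string of ASCII hex digits as a non-negative integer."""
    if not s:
        raise ValueError("empty string")
    value = 0
    for ch in s:
        # Shift the accumulated value one hex place and add the new digit.
        value = value * 16 + hex_digit_value(ch)
    return value


print(parse_hex("3AD29"))  # the example string from the quoted message
```

    Whether a given D is a letter or a digit is decided here purely by
    the caller choosing to run the string through parse_hex, that is,
    by context.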

    Disrupting that would be a case of breaking something which
    is working -- even if it would have been more ideal if the
    Latin script and mathematics had had a hexadecimal digit
    system in the first place and not had to borrow Latin letters
    to express numbers with radix > 10.

