RE: Hexadecimal digits?

From: Jim Allan (jallan@smrtytrek.com)
Date: Mon Nov 10 2003 - 14:02:09 EST

    Jim Ramonsky posted:

    > I am not the one who has not thought it through. There _is_ no
    > difference between decimal 7 and hex 7. They are the same digit. File777
    > sorts before File999 in _ALL_ radices.

    Exactly.

    So strings that mix hex and decimal numbers will not sort or compare
    properly under a natural sort *string* comparison, even if clones of the
    alpha characters are created with numeric values.

    Why then use a natural sort at all?

    If you want a natural sort using a mixed alpha and numeric string which
    may use multiple bases, a reasonable procedure might be to use the
    Unicode subscript numbers as base markers.

    Upon reaching one of these, the parser evaluates the subscript digits
    as a decimal number, which identifies the base, and then goes backward
    until it comes to the first character that is not a digit in that base.
    Then it can simply zero-extend the number for sort or comparison. Or a
    binary value can be used for sort or comparison if required.
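
    A rough Python sketch of that procedure, comparing by numeric value
    rather than by zero-extended strings. The function names and the sample
    file names are invented for illustration, and it assumes a hyphen or
    other non-digit separates each number from the text before it
    (otherwise a letter such as the "e" in "File" would be read as a hex
    digit):

```python
import re

SUBSCRIPT_DIGITS = "₀₁₂₃₄₅₆₇₈₉"          # U+2080..U+2089
TO_ASCII = str.maketrans(SUBSCRIPT_DIGITS, "0123456789")
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
MARKER = re.compile("[" + SUBSCRIPT_DIGITS + "]+")
PLAIN_NUMBER = re.compile(r"([0-9]+)")


def plain_parts(text):
    """Ordinary natural-sort split: ASCII digit runs become decimal integers."""
    parts = []
    for i, piece in enumerate(PLAIN_NUMBER.split(text)):
        if piece == "":
            continue
        # Odd positions are the captured digit runs. The 0/1 tag keeps
        # numbers and text comparable and sorts numbers first.
        parts.append((0, int(piece)) if i % 2 else (1, piece))
    return parts


def base_marker_key(name):
    """Sort key that reads a number followed by a subscript base in that base."""
    key = []
    pos = 0
    for marker in MARKER.finditer(name):
        base = int(marker.group().translate(TO_ASCII))
        start = marker.start()
        if 2 <= base <= 36:
            valid = set(ALPHABET[:base] + ALPHABET[:base].upper())
            # Go backward to the first character that is not a digit in this base.
            while start > pos and name[start - 1] in valid:
                start -= 1
        if start < marker.start():
            key.extend(plain_parts(name[pos:start]))
            key.append((0, int(name[start:marker.start()], base)))
        else:
            # No usable digits before the marker: treat the marker as text.
            key.extend(plain_parts(name[pos:marker.end()]))
        pos = marker.end()
    key.extend(plain_parts(name[pos:]))
    return key


names = ["File-1F₁₆", "File-25", "File-101₂", "File-30₁₀"]
print(sorted(names, key=base_marker_key))
# ['File-101₂', 'File-25', 'File-30₁₀', 'File-1F₁₆']
```

    The numeric values here are 5, 25, 30 and 31, so the binary number
    sorts first and the hex number last. Numbers without a marker are read
    as decimal.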

    This solves for all bases up to base 36. Such a system would be
    understood on sight by humans.

    Or again, if hex numbers are the only issue, use some normal
    hex-indication flag in the string so that both humans and the customized
    natural sort will know that the number is hex and where the number
    begins and ends, e.g. File-0x15A-19, File-0xB23A5-25,
    File-0x123ABCD-Extra, in which the center portion, between the two
    hyphens, would be recognized as hex by the "0x" prefix.
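
    A customized natural sort of that kind might look something like the
    following Python sketch. The name hex_prefix_key and the extra sample
    name File-200-1 are made up for illustration, and it assumes every hex
    number carries the "0x" prefix and ends at a hyphen or other non-hex
    character:

```python
import re

# A "0x"-prefixed hex number, or else a plain run of decimal digits.
NUMBER = re.compile(r"(0[xX][0-9A-Fa-f]+|[0-9]+)")


def hex_prefix_key(name):
    """Natural-sort key in which 0x-prefixed runs are compared as hex values."""
    key = []
    for i, piece in enumerate(NUMBER.split(name)):
        if piece == "":
            continue
        if i % 2:
            # Odd positions are the captured numbers.
            is_hex = piece[:2].lower() == "0x"
            key.append((0, int(piece, 16 if is_hex else 10)))
        else:
            key.append((1, piece))
    return key


names = [
    "File-0xB23A5-25",
    "File-0x15A-19",
    "File-0x123ABCD-Extra",
    "File-200-1",
]
print(sorted(names, key=hex_prefix_key))
# ['File-200-1', 'File-0x15A-19', 'File-0xB23A5-25', 'File-0x123ABCD-Extra']
```

    Here 200 sorts before 0x15A (346), which a plain string comparison
    would not manage.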

    Using symbols that the computer automatically distinguishes while human
    beings do not is a *dangerous* solution to any problem. Enough typos are
    made even when symbols are different. It is common in producing random
    uppercase alpha / numeric codes to avoid 0, O, Q, 1, I, 5, S, 8, B, U, V
    for that reason alone.
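
    For instance, a code generator along those lines could be sketched in
    Python as follows (make_code and the default length of 8 are
    illustrative choices, not anything standard):

```python
import secrets
import string

# Characters commonly avoided because they are easily mistaken for one another.
CONFUSABLE = set("0OQ1I5S8BUV")

SAFE_ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in CONFUSABLE
)


def make_code(length=8):
    """Return a random uppercase alpha/numeric code with no confusable characters."""
    return "".join(secrets.choice(SAFE_ALPHABET) for _ in range(length))


print(SAFE_ALPHABET)
print(make_code())
```

    That leaves 25 of the 36 uppercase letters and digits to draw from.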

    Now a completely new set of hex digits, as has been suggested, might
    make sense. But that is not for Unicode to prescribe, but for
    mathematical associations or perhaps some other computer standards
    organization. If such a set of digits were proposed by international
    organizations with very strong backing (comparable to introduction of
    the Euro symbol) then they would certainly have a place in Unicode.

    Or if a particular computer language were to introduce them in the PUA
    for that language and that usage became popular, then again they would
    be encoded by Unicode.

    But one wants to avoid as much as possible symbols that look identical
    to human beings but have radically different meanings. Unicode has
    enough of those by necessity and for backward compatibility.

    Jim Allan


