From: Jill Ramonsky (Jill.Ramonsky@Aculab.com)
Date: Mon Nov 10 2003 - 10:49:00 EST

Sorry, but I have to correct you. You state below that "[my] argument
doesn't work". This is slightly confusing because I haven't proposed any
argument, beyond saying that I support the inclusion in Unicode of hex
digits which are distinct from the letters A to Z.

I can only assume you are suggesting that the "natural sort" algorithm
works in decimal but not in hex. If so, I should mention that (1) it
wasn't me who invented the natural sort algorithm, so I can't take
credit for that in /any/ radix, and (2) there is absolutely no reason
why it wouldn't work in radix sixteen just as it would in radix ten. For
example 77 and 100 get sorted in the order (77, 100), not (100, 77) in
EVERY radix in which the digits 0, 1 and 7 exist. This is true in base
eight. It is true in base ten. It is true in base sixteen. It is even
true in base 68431. In fact, the only things the natural sort algorithm
needs to know are (a) which characters represent digits and which ones
don't, and (b) what the numerical value of each such digit is.
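
The radix-independence claim above can be checked with a short sketch.
It assumes both numerals are written in the same radix and are given as
lists of digit values, most significant first. The names
compare_numerals and value are illustrative only and do not come from
any particular natural sort implementation.

```python
def compare_numerals(a, b):
    """Compare two numerals given as digit values, most significant first.

    Returns -1, 0 or 1. The radix is never consulted: once leading zeros
    are dropped, the longer numeral is larger, and equal-length numerals
    compare digit by digit.
    """
    def strip(digits):
        # Drop leading zeros but keep at least one digit.
        digits = list(digits)
        i = 0
        while i < len(digits) - 1 and digits[i] == 0:
            i += 1
        return digits[i:]

    a, b = strip(a), strip(b)
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    return 0


def value(digits, radix):
    """Numeric value of the digits in a given radix (used only as a check)."""
    total = 0
    for d in digits:
        total = total * radix + d
    return total


# 77 sorts before 100 in every radix in which the digits 0, 1 and 7 exist.
for radix in (8, 10, 16, 68431):
    assert value([7, 7], radix) < value([1, 0, 0], radix)

print(compare_numerals([7, 7], [1, 0, 0]))  # -1
```

The comparison never mentions a radix, which is the point: it needs
only the digit values.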

To clarify, suppose we sorted the filenames mentioned so far with a
natural sort algorithm which recognised the codepoints U+218A to U+218F
as having the digit property, with values ten to fifteen respectively.
They would then sort as follows. (The algorithm wouldn't specifically
need to know that the digits were "hex", only that they were numeric.)

(1) U+46, U+69, U+6C, U+65, U+39, U+39 ("File99")
(2) U+46, U+69, U+6C, U+65, U+39, U+39, U+41 ("File99A") -- note that
this A is a letter
(3) U+46, U+69, U+6C, U+65, U+39, U+39, U+46 ("File99F") -- note that
this F is a letter
(4) U+46, U+69, U+6C, U+65, U+31, U+30, U+30 ("File100")
(5) U+46, U+69, U+6C, U+65, U+39, U+39, U+32 ("File992")
(6) U+46, U+69, U+6C, U+65, U+39, U+39, U+218A ("File99A") -- note that
this A is not a letter, but the digit ten
(7) U+46, U+69, U+6C, U+65, U+39, U+39, U+218F ("File99F") -- note that
this F is not a letter, but the digit fifteen
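
The ordering above can be reproduced with a minimal sketch of a natural
sort key. Two things here are assumptions. First, the digit table is
the hypothetical assignment from this message (U+218A to U+218F valued
ten to fifteen), not a property defined by the Unicode Standard.
Second, natural_key is just one simple way to write such a key, not the
definitive algorithm: runs of digits compare as numbers, and everything
else compares by code point.

```python
# Hypothetical digit table: ASCII 0-9, plus the code points U+218A..U+218F
# treated as digits ten..fifteen, as supposed in the message.
DIGIT_VALUE = {chr(0x30 + i): i for i in range(10)}
DIGIT_VALUE.update({chr(0x218A + i): 10 + i for i in range(6)})


def _numeric_chunk(run):
    # Drop leading zeros (keeping one digit), then order by length and
    # only then digit by digit. No radix is needed.
    i = 0
    while i < len(run) - 1 and run[i] == 0:
        i += 1
    digits = tuple(run[i:])
    return (0, len(digits), digits)


def natural_key(name):
    """Sort key: digit runs become numeric chunks, other characters stay text."""
    key, run = [], []
    for ch in name:
        if ch in DIGIT_VALUE:
            run.append(DIGIT_VALUE[ch])
        else:
            if run:
                key.append(_numeric_chunk(run))
                run = []
            key.append((1, 1, (ord(ch),)))
    if run:
        key.append(_numeric_chunk(run))
    return key


names = [
    "File99",
    "File99A",        # letter A (U+41)
    "File99F",        # letter F (U+46)
    "File100",
    "File992",
    "File99\u218A",   # hypothetical digit ten
    "File99\u218F",   # hypothetical digit fifteen
]

# Sort a reversed copy and check that it comes back in the listed order.
ordered = sorted(reversed(names), key=natural_key)
assert ordered == names
```

The key never refers to base sixteen. It relies only on which
characters are digits and on each digit's value.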

If you are suggesting that the natural sort algorithm won't work
/without/ separate codepoints for hex digits then you are of course
correct, but that is an argument in favor of hex-digit-characters, not
against them.

Jill

> -----Original Message-----
> From: Kent Karlsson [mailto:kentk@cs.chalmers.se]
> Sent: Monday, November 10, 2003 2:42 PM
> To: unicode@unicode.org
>
>
> > After all, 99A (in hexadecimal) is greater than 99 (hexadecimal).
>
> Oops. I missed the "2" key. E.g:
>
> > After all, 99A (in hexadecimal) is greater than 992 (hexadecimal).
>
> Sorry (both about missing the "2" and that your argument doesn't
> work)
> /kent k
>
