Re: Hexadecimal in many scripts

From: Markus Kuhn (Markus.Kuhn@cl.cam.ac.uk)
Date: Fri Jun 04 1999 - 17:03:07 EDT


Doug Ewell wrote on 1999-06-04 18:04 UTC:
> Markus wrote:
>
> > 0x06e1, 0x0410 Cyrillic_A А CYRILLIC CAPITAL LETTER A
> > 0x06e2, 0x0411 Cyrillic_BE Б CYRILLIC CAPITAL LETTER BE
> > 0x06e3, 0x0426 Cyrillic_TSE Ц CYRILLIC CAPITAL LETTER TSE
> > 0x06e4, 0x0414 Cyrillic_DE Д CYRILLIC CAPITAL LETTER DE
> > 0x06e5, 0x0415 Cyrillic_IE Е CYRILLIC CAPITAL LETTER IE
> > 0x06e6, 0x0424 Cyrillic_EF Ф CYRILLIC CAPITAL LETTER EF
>
> Let's ignore for the moment the question of whether non-Roman letters
> ought to be used at all to represent hexadecimal numbers.
>
> These are not the first six letters of the Cyrillic alphabet. The
> letters Markus gave are ordered by what Roman Czyborra calls the
> "KOI correspondence," in which Cyrillic letters are roughly equated
> to their Latin counterparts and then ordered as they would be in the
> Latin alphabet. The letters above correspond to A, B, C, D, E, F by
> sound, but are not what a Russian speaker would call "the first six
> letters."
>
> The real first six letters of the Cyrillic alphabet are:
>
> U+0410 CYRILLIC CAPITAL LETTER A
> U+0411 CYRILLIC CAPITAL LETTER BE
> U+0412 CYRILLIC CAPITAL LETTER VE
> U+0413 CYRILLIC CAPITAL LETTER GHE
> U+0414 CYRILLIC CAPITAL LETTER DE
> U+0415 CYRILLIC CAPITAL LETTER IE

(АБВГДЕ)

Thanks. I guess there might actually be good arguments for using the
KOI order and not the alphabetic order for hexadecimal entry on
Cyrillic keyboards, but ISO 14755 explicitly talks about the
first six letters of the alphabet:

  "As there are no digits available beyond 9, the first 6 letters of
  the Latin alphabet (or of any alphabet if the Latin script is not
  used) are used to represent the extra hexadecimal digits."

...

  "The keyboard in use shall have an alphanumeric section. This
  alphanumeric section shall provide a space bar (which generates
  the character <SPACE>), the ten decimal digits and the first 6
  letters of the Latin alphabet if the Latin script is used, or the
  first six letters of any other alphabet if a different script is
  used."
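
To make the difference between the two orderings concrete, here is a
minimal Python sketch. It only illustrates the ambiguity and is not
anything ISO 14755 specifies; the names ALPHABETIC, KOI_ORDER,
to_latin_hex and parse_hex are made up for this example.

```python
# Two candidate sets of six Cyrillic "hex letters" for the digits A-F.
LATIN = "ABCDEF"
# First six letters of the alphabet: U+0410..U+0415 (A BE VE GHE DE IE).
ALPHABETIC = "".join(chr(c) for c in range(0x0410, 0x0416))
# KOI correspondence, matched to Latin A-F by sound: A BE TSE DE IE EF.
KOI_ORDER = "\u0410\u0411\u0426\u0414\u0415\u0424"


def to_latin_hex(text, letters):
    """Translate Cyrillic hex letters (either case) to Latin hex digits."""
    table = {}
    for cyr, lat in zip(letters, LATIN):
        table[ord(cyr)] = lat
        table[ord(cyr.lower())] = lat
    return text.translate(table)


def parse_hex(text, letters):
    """Parse a hex number whose letter digits come from `letters`."""
    return int(to_latin_hex(text, letters), 16)


# The same keystrokes, "4" followed by CYRILLIC CAPITAL LETTER IE,
# give different values depending on the convention in use.
typed = "4\u0415"
print(hex(parse_hex(typed, ALPHABETIC)))  # IE is the sixth letter, so F
print(hex(parse_hex(typed, KOI_ORDER)))   # IE sounds like E, so E
```

Under the alphabetic order the input reads as 4F, under the KOI order
as 4E, so both ends have to agree on which convention is in use.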

For those who haven't yet seen ISO 14755, in a nutshell it works like
this:

  1. Press and hold Ctrl and Shift
  2. Enter hexadecimal digits of UCS character
  3. Release Ctrl and Shift

If you enter a sequence of UCS characters this way and get tired of
releasing and re-pressing Ctrl and Shift to separate the hex numbers
from each other, you can instead press the space bar to start a new
UCS character while still holding Ctrl and Shift.
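
The decoding side of this can be sketched in a few lines of Python.
This is only an illustration under simple assumptions: the function
name decode_entry is made up, the input is the string of keys typed
during a single Ctrl+Shift hold, and only Latin hex digits are handled.

```python
def decode_entry(keys):
    """Decode hex digits typed while Ctrl and Shift are held.

    A space starts a new UCS character, so each space-separated
    group of hex digits becomes one character.
    """
    chars = []
    for group in keys.split():
        code = int(group, 16)  # raises ValueError on non-hex input
        chars.append(chr(code))
    return "".join(chars)


# Three characters entered in one hold, separated by the space bar.
print(decode_entry("410 411 412"))
```

A real implementation would also have to decide what to do with
invalid digits and out-of-range values; the sketch simply lets Python
raise ValueError.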

There is also a way to enter the functional symbols on your keyboard.
For instance, if you want to enter the Shift-Tab symbol, which looks
like |<-- , do the following:

  1. Press Ctrl and Shift
  2. Release Ctrl and Shift
  3. Press Shift and Tab

So pressing Shift and Control and releasing them without any key pressed
in between puts the keyboard into a mode where the functional keys
produce the symbols printed on them according to ISO 9995-7, for
instance ↲ for enter and ⇥ for tab.
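
The mode switch can be modelled as a tiny state machine. The Python
sketch below is a guess at the behaviour, not text from the standard:
the Keyboard class is hypothetical, U+21E4 is assumed as the code point
for the |<-- symbol, and the mode is assumed to end after one key.

```python
# (shift held?, key) -> symbol. The first two are the ones named above;
# U+21E4 for Shift-Tab is an assumption made for this sketch.
SYMBOLS = {
    (False, "Enter"): "\u21b2",
    (False, "Tab"): "\u21e5",
    (True, "Tab"): "\u21e4",
}
MODIFIERS = ("Ctrl", "Shift")


class Keyboard:
    def __init__(self):
        self.held = set()          # modifiers currently held down
        self.clean_chord = False   # Ctrl+Shift held, no other key yet
        self.symbol_mode = False   # next functional key yields a symbol
        self.output = []

    def press(self, key):
        if key in MODIFIERS:
            self.held.add(key)
            if self.held == {"Ctrl", "Shift"}:
                self.clean_chord = True
            return
        # Any other key spoils a pending Ctrl+Shift chord.
        self.clean_chord = False
        if self.symbol_mode:
            symbol = SYMBOLS.get(("Shift" in self.held, key))
            if symbol is not None:
                self.output.append(symbol)
            self.symbol_mode = False  # assumed: mode lasts for one key

    def release(self, key):
        self.held.discard(key)
        if key in MODIFIERS and not self.held and self.clean_chord:
            self.symbol_mode = True
            self.clean_chord = False


# The three steps above: chord, release, then Shift and Tab.
kb = Keyboard()
for action, key in [("press", "Ctrl"), ("press", "Shift"),
                    ("release", "Ctrl"), ("release", "Shift"),
                    ("press", "Shift"), ("press", "Tab")]:
    getattr(kb, action)(key)
print(kb.output)
```

If a hex digit is typed while Ctrl and Shift are held, the chord no
longer counts as empty and the symbol mode is not entered.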

ISO 14755 also says that it would be nice to have some on-screen tables
from which you can pick characters and that there should be ways to
find out the UCS hex code for any displayed character.

It all sounds rather reasonable to me (except, on further reflection,
for the bit about non-Latin hex numbers).

I think it would be reasonable to expect every PC keyboard on this
planet to have a mode for entering at least a very basic ASCII subset
such as [A-Z0-9./-], otherwise entering email addresses, URLs, etc. will
be real φun.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

╔══╦══╗  ┌──┬──┐  ╭──┬──╮  ╭──┬──╮  ┏━━┳━━┓  ┎┒┏┑   ╷  ╻ ┏┯┓ ┌┰┐    ▊ ╱╲╱╲╳╳╳
║┌─╨─┐║  │╔═╧═╗│  │╒═╪═╕│  │╓─╁─╖│  ┃┌─╂─┐┃  ┗╃╄┙  ╶┼╴╺╋╸┠┼┨ ┝╋┥    ▋ ╲╱╲╱╳╳╳
║│╲ ╱│║  │║   ║│  ││ │ ││  │║ ┃ ║│  ┃│ ╿ │┃  ┍╅╆┓   ╵  ╹ ┗┷┛ └┸┘    ▌ ╱╲╱╲╳╳╳
╠╡ ╳ ╞╣  ├╢   ╟┤  ├┼─┼─┼┤  ├╫─╂─╫┤  ┣┿╾┼╼┿┫  ┕┛┖┚     ┌┄┄┐ ╎ ┏┅┅┓ ┋ ▍ ╲╱╲╱╳╳╳
║│╱ ╲│║  │║   ║│  ││ │ ││  │║ ┃ ║│  ┃│ ╽ │┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▎
║└─╥─┘║  │╚═╤═╝│  │╘═╪═╛│  │╙─╀─╜│  ┃└─╂─┘┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▏
╚══╩══╝  └──┴──┘  ╰──┴──╯  ╰──┴──╯  ┗━━┻━━┛           └╌╌┘ ╎ ┗╍╍┛ ┋  ▁▂▃▄▅▆▇█


