Re: "First six ideographs" problem considered bogus

From: Chun-shek Charles Chan (chunshek@hotbot.com)
Date: Wed Jun 09 1999 - 23:40:44 EDT


One note on how this issue applies to Bopomofo:

Under the current Traditional Chinese bopomofo keyboard layout (used in Taiwan and China), two of the first six Bopomofo symbols (bo and de) map to the first row of the keyboard (i.e. the number keys). So when a user wants to input hex Unicode values, he will HAVE to switch back to Latin mode, OR use the numeric keypad together with the six keys on the side, as described a while ago in this thread.

However, well-designed Chinese programs have always known how to _automatically_ switch to Latin mode when half-width data is needed, and to _automatically_ switch back to the original mode once the input is finished -- this includes the font-size field in word processors and password entry.

Also, bopomofo has been declared obsolete in China and Singapore, since Simplified Chinese uses romanized phonetics.

Since neither bopomofo nor the romanized phonetics depend on case (in fact bopomofo has no case), Input Method Editors have arranged their key mappings so that when the user wants to input an occasional Latin letter, he can do so by simply pressing SHIFT and the letter, without needing to switch modes back and forth.
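To make the SHIFT trick concrete, here is a minimal sketch of how such a key mapping might look. It is not taken from any real Input Method Editor: the function name `translate_key` is hypothetical, the key table covers only a handful of keys of the common bopomofo layout, and emitting an uppercase letter for SHIFT plus a letter is an assumption.

```python
# Partial key table, for illustration only; a real IME covers the whole keyboard.
BOPOMOFO_KEYS = {
    "1": "ㄅ",  # bo (on the number row)
    "q": "ㄆ",  # po
    "a": "ㄇ",  # mo
    "z": "ㄈ",  # fo
    "2": "ㄉ",  # de (on the number row)
    "w": "ㄊ",  # te
}


def translate_key(key: str, shift: bool) -> str:
    """Return what a hypothetical IME in bopomofo mode emits for one keypress.

    Only unshifted keys and SHIFT + letter are modeled; SHIFT with
    non-letter keys is not handled specially here.
    """
    if shift and key.isascii() and key.isalpha():
        # Bopomofo has no case, so SHIFT is free to mean "give me the Latin letter".
        return key.upper()
    # Keys missing from the partial table are passed through unchanged.
    return BOPOMOFO_KEYS.get(key, key)
```

Note that the digits 1 and 2 are swallowed by the phonetic layer in this mode, which is why typing a hex value such as 12AB still needs a mode switch or the numeric keypad.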

Similar techniques could be used if someone desperately wants to implement a method for Chinese users to enter hex Unicode values. But as a Chinese speaker, it would seem rather senseless to me if bopomofo were used to represent hex values instead of the Latin letters A-F.

Just my two cents here.

---
Chun-shek Charles CHAN [LPCUWC 98-00]

Email add: chunshek@hotbot.com

---
DATE: Wed, 9 Jun 1999 09:01:24
From: John Cowan <cowan@locke.ccil.org>
To: Unicode List <unicode@unicode.org>

The text of IS 17455 says:

# As there are no digits available beyond 9, the first 6 letters
# of the Latin alphabet (or of any alphabet if the Latin script
# is not used) are used to represent the extra hexadecimal
# "digits" [...].

Han script is not an alphabet, and need not be handled by an implementor of this standard, however internationalized. Therefore, there is an issue only for the Western scripts (Greek, Cyrillic, Armenian, Georgian), the West Asian scripts (Arabic, Hebrew), and the South Asian scripts (Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Tibetan), plus Bopomofo.

For each of these, there is a canonical list of basic letters and a canonical ordering. ("Ch" may be a letter in Spanish for some purposes, but it is not a *basic* letter.)
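As a rough illustration of the rule quoted above, the following sketch reads a hexadecimal number whose extra "digits" are the first six basic letters of a chosen alphabet. The function `parse_hex` and the table `FIRST_SIX` are hypothetical names, only three scripts are listed, and accepting only ASCII digits 0-9 is an assumption rather than anything the quoted text requires.

```python
# First six basic letters, in canonical order, for a few alphabets.
FIRST_SIX = {
    "latin": "abcdef",
    "greek": "αβγδεζ",
    "cyrillic": "абвгде",
}


def parse_hex(text: str, script: str = "latin") -> int:
    """Parse a hex string whose digits above 9 come from the given script."""
    if not text:
        raise ValueError("empty hexadecimal string")
    letters = FIRST_SIX[script]
    value = 0
    for ch in text.lower():
        if ch in "0123456789":
            digit = ord(ch) - ord("0")
        elif ch in letters:
            # Position in the canonical ordering gives the values 10..15.
            digit = 10 + letters.index(ch)
        else:
            raise ValueError(f"{ch!r} is not a hex digit in script {script!r}")
        value = value * 16 + digit
    return value
```

For example, `parse_hex("4E00")` and `parse_hex("4ε00", "greek")` would both denote the same code point under this reading, since epsilon is the fifth Greek letter just as E is the fifth Latin one.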

Hiragana and Katakana are syllabaries, but can be handled by the same principles.

This list needs extending for Unicode 3.0.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

--------- End Forwarded Message ---------



