Re: Hexadecimal in many scripts (ISO 14755)

From: Peter_Constable@sil.org
Date: Tue Jun 08 1999 - 17:30:45 EDT


I get the feeling this thread is going around in circles, touching on but never
hitting the key points. I sent the message below yesterday, but I haven't seen
any responses to it, and wonder if it landed in a vacuum somewhere. I thought it
provided some needed clarification to this discussion, though.

We originally got onto telephone keypads because of the claim that they have
made 0-9 universal, but there has been some digression onto using a
telephone/ATM keypad for hex entry. I don't think that was ever the point. As
John Cowan reminded us:

>Anyhow, the point is to type Unicode characters on computer keyboards, not
>telephone keypads.

People seem to have given up on the suggestion that codes be entered in terms of
decimal representations (no objections on my part). So the issue then is whether
to 'localise' hex codes. The questions here are:

- Does everyone in the world have a keyboard that has a Latin mode?
- Is switching to a Latin mode too much of a hassle?
- Even if switching to a Latin mode isn't too inconvenient, should a standard be
developed that accommodates such cultural differences anyway?
- Is the latter idea feasible?

Sandra Martin O'Donnell has also provided a reminder which touches on the first
of these issues:

>Throughout this voluminous thread, I've seen several people ask whether anyone
>has a keyboard that does NOT include the Latin letters A-Z and a-z, or the
>digits 0-9. So far I haven't seen an answer. It seems that information is needed
>before worrying much about determining the first six "letters" in every
>alphabet.

On the second issue, in my reckoning, we're talking about a need that should
only occur *infrequently*. (Read my message below to see why I assume this.) We
can talk about the inconvenience of switching keyboard modes in order to key a
Latin hex sequence, but if people are doing that only infrequently, I don't
think that's a problem.

The third issue is open to lots of subjective opinion, but I've given some
reasons (see below) why I think this isn't necessary.

The fourth issue is something I've questioned in previous messages, and I won't
repeat those concerns here.

Summarising the way it all seems to me, we haven't seen evidence that Latin
*can't* be used by any group of users; I've presented arguments why using Latin
hex is not too inconvenient; I've presented arguments that using Latin hex is
not actually necessary; and I've raised several questions about the feasibility
of specifying a bunch of localised versions of hex digits. This is not to say
that the standards folks can't choose to pursue it anyway, but I would expect
them to decide where they stand on these issues before they get too far.

Peter

---------------------- Forwarded by Peter Constable/IntlAdmin/WCT on 06/08/99
04:51 PM ---------------------------

To: unicode@unicode.org
cc:
Subject: Re: Hexadecimal in many scripts (ISO 14755)

John Cowan wrote:
>I chose U+2323 for a reason: it is very unlikely to appear on any keyboard, and
>even fairly unlikely to appear in any list of "frequently used non-keyboard
>characters" that a particular IME may provide. It's just a freak, a highly
>specialized character that you want when you want it and not otherwise.

Let's distinguish among the following:

1. writing systems that a person uses regularly
2. characters that an individual wants to use on an occasional but somewhat
regular basis
3. a character that an individual wants to use in an isolated instance

Situation 1 is typically, and generally should be, provided for by IMs made
available to the user, usually with the OS.

Situation 2 should best be met by user-definable IMs. John is right that there
will be characters that are unlikely to appear as part of any packaged IM but
that an individual may want to use regularly. These are generally specific to
the individual, and the preferred way of keying them is also specific to the
individual since there is (by definition) no standard keying. The number of
users that want to use unusual characters is not a majority, and this group
could probably be motivated to come up with their own IM definition if they were
given tools that made it easy enough to implement.

Situation 3 should be met by exactly the kind of mechanism being suggested by
ISO 14755. For this reason, I think the idea should be pursued. The open
questions are:

Q: Should codes be entered according to decimal or hexadecimal representations?

As has already been indicated, people are *very* unlikely to find decimal
representations in a table, unless they create the table themselves and do the
conversions. These should be hex.
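
To make the point concrete, here is a minimal sketch in Python of what a
hex-based entry mechanism has to do. The function names are made up for this
illustration and are not taken from ISO 14755; the example value is the U+2323
that John mentioned.

```python
def char_from_hex(code: str) -> str:
    """Return the character whose code point is written in hex, e.g. '2323'."""
    return chr(int(code, 16))


def hex_from_decimal(value: int) -> str:
    """Convert a decimal code value to the four-digit hex form used in code charts."""
    return format(value, "04X")


# The charts list this character as 2323; a decimal entry scheme would expect
# 8995, which the user would first have to work out for themselves.
assert hex_from_decimal(8995) == "2323"
assert char_from_hex("2323") == chr(8995)
```

With hex entry the user copies the digits straight from the chart; with decimal
entry the conversion step falls on the user.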

Q: Should users be able to enter using transliterations of hex codes (what I've
referred to in earlier messages as "localising" hex)?

I think there are some significant concerns about such a plan, and I've
mentioned them previously. In addition, I think it's highly likely that
anyone who wants to enter an uncommon character into their document - someone
who has an awareness of the character, who has a wish to use it in a document,
and who has access to the standard - is the type of person who has a keyboard
that supports a Latin layout and/or is familiar with some Latin layout. (This
kind of person surely works with Latin - at least, they need to be able to read
the standard and/or type in the URL to the charts on the Unicode web site, and
they probably work with a lot of other Latin URLs as well.) My guess is that
this person would probably prefer to enter hex codes in terms of the actual
Latin characters, even if they had a way to enter the code in terms of a
transliteration of hex.
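
For anyone unsure what "localising" hex would mean in practice, the following
Python sketch shows one way it could work. The mapping is purely hypothetical:
it assumes the first six letters of the Russian alphabet stand in for A-F, which
is an invention for this example and not something specified by ISO 14755 or
proposed in this thread.

```python
# Hypothetical mapping: the first six Cyrillic capitals (U+0410..U+0415)
# stand in for the Latin hex digits A-F. This is an assumption for illustration.
CYRILLIC_FIRST_SIX = "\u0410\u0411\u0412\u0413\u0414\u0415"
LOCAL_TO_LATIN = dict(zip(CYRILLIC_FIRST_SIX, "ABCDEF"))


def normalise_hex(entry: str) -> str:
    """Rewrite a 'localised' hex string using the Latin digits A-F."""
    return "".join(LOCAL_TO_LATIN.get(ch, ch) for ch in entry.upper())


def char_from_local_hex(entry: str) -> str:
    """Return the character for a code keyed with localised hex digits."""
    return chr(int(normalise_hex(entry), 16))


# Keying 00 + U+0414 + 9 is treated as 00E9 under this mapping.
assert normalise_hex("00\u04149") == "00E9"
```

Latin input passes through unchanged, so a scheme like this would sit alongside
ordinary Latin hex rather than replace it.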

Between the problems involved in transliterating hex into other scripts and the
question about whether it is really needed or would be used, I think this part
of the plan needs to be thought through much more carefully before it is
implemented.

Peter


