RE: Obsolete characters

From: Richard Ishida
Date: Thu Jan 22 2009 - 10:27:57 CST

    I maintain UniView [1], which can be used as a general picker/character map, separately from an open-ended list of smaller, typically language-specific pickers aimed at particular communities (particularly people unable to use keyboard input), such as Myanmar [2], IPA [3], Tlicho [4] and Urdu [5]. The reasons are:

    a. what one language speaker considers archaic or irrelevant, another values as an important part of their repertoire

    b. even when dealing with a single language, different people need to get at the characters in different ways. For example, my more recent S/SE Asian pickers allow users to identify characters by proximity (a mixture of alphabetic and typing order)[1], purely by shape (including common conjuncts and ligatures)[6], or by starting from a transcription[7]. Other pickers arrange characters in other ways: Tlicho[4] uses a keyboard layout (though that picker is still in beta), and the IPA picker[3] uses an arrangement similar to the common phonetic charts.

    So I don't think it's possible to provide a really useful tailored approach of this kind in a general picker, because you just can't be all things to all people in one place. Separating out archaic characters is especially prone to this problem: one man's meat is another's poison.

    On the other hand, with something like UniView you can search for things and produce tailored lists. Another thing you could do is highlight, when viewing a given block, the characters that are used in a given language. For instance, if you are looking at the Arabic block, provide a pull-down selection list that, without changing the set of characters shown, highlights the characters *more commonly* used in a given language, such as Urdu vs. Persian vs. Sindhi vs. Kurdish, etc., but does not screen out the rest. (UniView allows you to highlight characters with particular properties in this way.) This is more informational/indicative than restrictive, though.
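
    To make the highlighting idea concrete, here is a minimal sketch in Python. It is not how UniView or any existing picker works; the function name annotate_block and the table COMMON_BY_LANGUAGE are hypothetical, and the per-language sets are tiny, deliberately incomplete samples chosen only for illustration. The point is that every character in the block is still returned, and the language choice only sets a flag.

```python
import unicodedata

# Hypothetical, deliberately incomplete samples for illustration only;
# a real picker would load per-language repertoires from proper data.
COMMON_BY_LANGUAGE = {
    "Urdu": {0x0679, 0x0688, 0x0691, 0x06BA, 0x06BE, 0x06C1, 0x06D2},
    "Persian": {0x067E, 0x0686, 0x0698, 0x06A9, 0x06AF, 0x06CC},
}

ARABIC_BLOCK = range(0x0600, 0x0700)  # U+0600..U+06FF


def annotate_block(block, language):
    """Return (code point, name, highlighted) for each assigned character.

    No character is filtered out by language; the flag only marks
    the ones listed for the chosen language.
    """
    common = COMMON_BY_LANGUAGE.get(language, set())
    rows = []
    for cp in block:
        name = unicodedata.name(chr(cp), None)
        if name is None:
            continue  # unassigned in this Python's Unicode database
        rows.append((cp, name, cp in common))
    return rows


if __name__ == "__main__":
    for cp, name, highlighted in annotate_block(ARABIC_BLOCK, "Urdu"):
        marker = "*" if highlighted else " "
        print(f"{marker} U+{cp:04X} {name}")
```

    Switching the language argument changes which rows carry the marker, but not which rows are listed, which is what keeps this indicative rather than restrictive.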

    The other thing UniView allows you to do is read descriptions that go beyond what's in the UCS character charts, where I've had time to develop or upload them (e.g. in the Myanmar block [8], click on the DB checkbox, then on a character). I think this descriptive information is more likely to be of great use to people using a general picker. For example: what character should I use for a hamza-on-a-chair in Urdu? Which glottal stop should I use as a phonetician vs. as a Na-Dene user? There is some information on this in the Unicode charts, but it is very patchy. I'd much rather see effort put into that, since I regularly find myself wishing for it.

    Just my 2p.

    Richard Ishida
    Internationalization Lead
    W3C (World Wide Web Consortium)

    From: Mark Davis
    Sent: 16 January 2009 05:07
    To: Asmus Freytag
    2. Independently, in doing a character picker, we found it useful to put the archaic/obsolete characters in separate sections. This is work we are looking at at Google, but we're also making the data available so that others can use or tweak it if they wish.
