Re: Level of Unicode support required for various languages

From: John H. Jenkins
Date: Fri Oct 26 2007 - 11:04:03 CDT


    On Oct 26, 2007, at 8:14 AM, Mark E. Shoulson wrote:

    > Yeah, and an "x" in English has a different meaning (sound) than an
    > "x" in Spanish (letters "mean" sounds; Chinese graphs mean words.
    > More or less). Yet we still encode them the same because they look
    > the same. Unicode generally tries to code what's written more than
    > what's meant, I thought.

    Well, not really.

    Unicode tries to formalize the informal understanding that users of a
    script bring to it. In the case of x, "everybody" knows that it's the
    same letter in English as in Spanish. In East Asia, there are a
    number of cases where "everybody" knows that two entities are separate
    characters even if they look almost the same and in fact may be
    indistinguishable in practice.

    It gets complicated, of course, because my "everybody" may disagree
    with your "everybody," and technical limitations impose themselves,
    and so on and so on. But this is largely why Han has the "non-
    cognate" rule--in practice, East Asian lexicographers have been using
    it for centuries when preparing dictionaries and other authoritative
    character lists.

    John H. Jenkins
