Re: Major Defect in Combining Classes of Tibetan Vowels

From: Jim Allan
Date: Wed Jun 25 2003 - 15:35:38 EDT

    Rick McGowan posted and was answered by John Hudson:

    >>If there isn't a visual difference here, how could there be a lexical
    >>difference? Imagine the age before computers. All you have to go on is
    >>what's on the page. There isn't an inherent order in those elements; they
    >>could have been written by the scribe in any order. If they appear the
    >>same, you can't assign different meanings -- except by some extra-syllabic
    >>informational context... right?
    > On the page, you would know -- or hopefully know -- from context. But a
    > search engine or a sorting algorithm looking at the characters presumably
    > needs to know the difference without additional context, hence the
    > character ordering is important.
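
    To make the quoted point about character order concrete: a plain code
    point comparison sees two mark orders as different strings, and canonical
    normalization erases that difference only when the marks have different
    combining classes. The following is a minimal sketch using Latin
    combining marks chosen purely for illustration, not the Tibetan vowels
    under discussion; the helper name same_after_nfd is hypothetical.

```python
import unicodedata

def same_after_nfd(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition and reordering."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Acute (class 230) and dot below (class 220) have different combining
# classes, so normalization puts them into one fixed order.
acute_first = "a\u0301\u0323"
dot_first = "a\u0323\u0301"

# Acute and grave share class 230, so their relative order is preserved
# and the two sequences stay distinct after normalization.
acute_grave = "a\u0301\u0300"
grave_acute = "a\u0300\u0301"

print(acute_first == dot_first)                  # raw code point comparison
print(same_after_nfd(acute_first, dot_first))    # different classes
print(same_after_nfd(acute_grave, grave_acute))  # same class
```

    A search or sort that compares normalized text can therefore only see an
    ordering distinction between marks of the same combining class.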

    I think such distinctions are more than one should expect from a
    standard search engine or from simple sorting.

    To move to French, for example, I would not expect a search engine to
    be able to tell whether the abbreviation "M." in "M. Bouteillier" stands
    for "Monsieur" or a name like "Marcel".

    How do you know except from context whether "med." stands for "medical"
    or "medieval"?

    In a company name such as "Perrault & Lavigne" should "&" sort according
    to default Unicode or as "and" or as "et"?

    Should it be found from searches on "and", "et", "und" and so forth?

    This is the business of application protocols and application utilities.
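
    As a rough illustration of what such an application utility might look
    like, here is a minimal sketch that spells out "&" per language for
    sorting and accepts any of the listed words when searching. The table
    AMPERSAND_WORDS and the functions sort_key and matches are hypothetical
    names invented for this example; nothing here is defined by Unicode.

```python
# Hypothetical per-language expansions; an application would supply its own.
AMPERSAND_WORDS = {"en": "and", "fr": "et", "de": "und"}

def sort_key(name: str, lang: str) -> str:
    """Build a sort key that spells out "&" in the given language."""
    word = AMPERSAND_WORDS.get(lang)
    expanded = name.replace("&", word) if word else name
    return expanded.casefold()

def matches(name: str, query: str) -> bool:
    """Treat a search for any known word for "&" as a hit on "&" itself."""
    words = name.casefold().split()
    q = query.casefold()
    return q in words or ("&" in words and q in AMPERSAND_WORDS.values())

print(sort_key("Perrault & Lavigne", "fr"))
print(matches("Perrault & Lavigne", "und"))
```

    The choice of expansion lives entirely in the application's table, which
    is where the language-specific knowledge belongs.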

    Indication of proper expansion of abbreviations for sorting and
    searching seems to me to be beyond what Unicode tries to do and what it
    can do reasonably.

    If lexical forms in any language have variant meanings, it is not for
    Unicode to distinguish them. The occasional exception is when Unicode
    encodes identical glyphs as separate characters with very different
    properties, such as "!" (U+0021) for punctuation and "ǃ" (U+01C3) for a
    Zulu click, in the hope, probably vain, that people in general will
    recognize the difference.
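
    The difference in properties can be inspected directly. A minimal
    sketch with Python's standard unicodedata module, comparing U+0021
    EXCLAMATION MARK with U+01C3 LATIN LETTER RETROFLEX CLICK; the helper
    name describe is hypothetical.

```python
import unicodedata

def describe(ch: str) -> tuple[str, str, bool]:
    """Return the character's name, general category, and whether it is a letter."""
    return unicodedata.name(ch), unicodedata.category(ch), ch.isalpha()

for ch in ("\u0021", "\u01C3"):
    name, category, is_letter = describe(ch)
    print(f"U+{ord(ch):04X}  {name}  {category}  letter={is_letter}")

# Compatibility normalization does not fold the click letter into "!".
print(unicodedata.normalize("NFKC", "\u01C3") == "\u0021")
```

    The first is punctuation (category Po) and the second is a letter
    (category Lo), so word-breaking and identifier rules treat them
    differently even though the glyphs look alike.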

    Jim Allan
