Re: Major Defect in Combining Classes of Tibetan Vowels

From: Jim Allan (jallan@smrtytrek.com)
Date: Wed Jun 25 2003 - 15:35:38 EDT

Next message: Peter_Constable@sil.org: "Re: Revised N2586R"

Previous message: Michael Everson: "Re: Major Defect in Combining Classes of Tibetan Vowels"
Maybe in reply to: Christopher John Fynn: "Major Defect in Combining Classes of Tibetan Vowels"
Next in thread: Kenneth Whistler: "Re: Major Defect in Combining Classes of Tibetan Vowels"

Rick McGowan posted and was answered by John Hudson:

>>If there isn't a visual difference here, how could there be a lexical
>>difference? Imagine the age before computers. All you have to go on is
>>what's on the page. There isn't an inherent order in those elements; they
>>could have been written by the scribe in any order. If they appear the
>>same, you can't assign different meanings -- except by some extra-syllabic
>>informational context... right?
>
> On the page, you would know -- or hopefully know -- from context. But a
> search engine or a sorting algorithm looking at the characters presumably
> needs to know the difference without additional context, hence the
> character ordering is important.

I think such distinctions are more than one should expect from a
standard search engine or from simple sorting.

To move to French, for example, I would not expect to be able to tell
whether the abbreviation "M." in "M. Bouteillier" stands for "Monsieur"
or a name like "Marcel".

How do you know except from context whether "med." stands for "medical"
or "medieval"?

In a company name such as "Perrault & Lavigne" should "&" sort according
to default Unicode or as "and" or as "et"?

Should it be found from searches on "and", "et", "und" and so forth?

This is the business of application protocols and application utilities.

Indication of proper expansion of abbreviations for sorting and
searching seems to me to be beyond what Unicode tries to do and what it
can do reasonably.
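
A minimal sketch of what such an application-level utility might look
like, assuming a hypothetical per-application expansion table. The
names EXPANSIONS, search_terms and matches are made up for
illustration, and nothing here comes from Unicode itself:

```python
# Hypothetical application-level expansion table. The choice of
# expansions is the application's decision, not something Unicode defines.
EXPANSIONS = {
    "&": {"and", "et", "und"},
}


def search_terms(text):
    """Return the lowercase terms a search should match for this text."""
    terms = set()
    for token in text.lower().split():
        terms.add(token)
        # Add any application-defined expansions of this token.
        terms |= EXPANSIONS.get(token, set())
    return terms


def matches(query, text):
    """True if the query equals a token of text or one of its expansions."""
    return query.lower() in search_terms(text)
```

With this table, matches("et", "Perrault & Lavigne") and
matches("and", "Perrault & Lavigne") both succeed. Which expansion is
the right one for a given company name is still something only the
application, or its user, can know.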

If lexical forms in any language have variant meanings, it is not for
Unicode to distinguish them. The occasional exception is where Unicode
encodes near-identical glyphs as separate characters with very different
properties, such as "!" (U+0021) for punctuation and "ǃ" (U+01C3) for a
Zulu click, in the hope, probably vain, that people in general will
recognize the difference.
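
The difference in properties can be seen with Python's standard
unicodedata module. This is only a sketch: describe is a helper name
made up here, and the two characters are assumed to be U+0021 and
U+01C3.

```python
import unicodedata


def describe(ch):
    """Return (code point, character name, general category) for ch."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


# The punctuation mark and the click letter look alike but are
# distinct characters with different general categories.
exclamation = describe("\u0021")  # category Po: punctuation, other
click = describe("\u01C3")        # category Lo: letter, other
```

Software that consults these properties will treat the two characters
differently. Whether the person typing picked the right one is another
matter.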

Jim Allan


This archive was generated by hypermail 2.1.5 : Wed Jun 25 2003 - 16:24:34 EDT