From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Wed Aug 06 2003 - 17:53:45 EDT
On 06/08/2003 14:04, John Jenkins wrote:
> Speaking purely as an old fart, I'd say the former. We already break
> the latter principle in Thai and Lao, and having to be prepared to scan
> either forward or backward from a base character in order to find its
> combining marks would add overhead to a lot of code, including
> existing code.
>
> On Wednesday, August 6, 2003, at 2:16 PM, John Cowan wrote:
>
>> I would like to ask the old farts^W^Wrespected elders of the UTC
>> which principle they consider more important, abstractly speaking:
>> the principle that combining marks always follow their base characters
>> (a typographical principle), or that text is stored, with a few minor
>> exceptions, in phonetic order (a lexicographical principle).
>>
>>
> ========
> John H. Jenkins
> jenkins@apple.com
> jhjenkins@mac.com
> http://homepage..mac.com/jhjenkins/
>
>
>
This answer presupposes that there is a well-defined concept of which
base character a combining mark belongs to. That is not always true. The
particular combining mark which precipitated the debate may be situated
above the gap between the (logically and phonetically) preceding and
following characters, or may move on to the preceding or the following
characters depending on the precise context and on the typographer's
preference.
Anyway, John J, what code are we talking about that has to work from the
positions of the combining marks back to the underlying representation?
Are you talking about OCR?
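
For concreteness, here is a minimal sketch of the kind of forward-only scan that the "marks follow their base" principle permits. It is an illustration only, not code from any particular implementation: the function name clusters_forward is hypothetical, and treating every character whose General Category starts with "M" as a combining mark is a simplifying assumption.

```python
import unicodedata


def clusters_forward(text):
    """Group each base character with the combining marks that follow it.

    Assumption: any character in General Category Mn, Mc or Me is a
    combining mark belonging to the nearest preceding non-mark character.
    """
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            # Attach the mark to the cluster opened by the previous base.
            clusters[-1] += ch
        else:
            # A base character (or a stray leading mark) opens a new cluster.
            clusters.append(ch)
    return clusters


# Hebrew bet + patah, then lamed: the patah stays with the bet.
sample = "\u05D1\u05B7\u05DC"
result = clusters_forward(sample)
```

A scan like this never looks backward from a mark to a later base, so a mark stored before the character it is drawn on would be attached to whatever precedes it. Handling that case would need a second pass or lookahead, which is the overhead referred to above.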
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/