Re: Fwd: Wired 4.09 p. 130: Lost in Translation

From: Martin J Duerst ([email protected])
Date: Wed Aug 28 1996 - 11:17:02 EDT

Next message: Michael Everson: "Re: Fwd: Wired 4.09 p. 130: Lost in Translation"
Previous message: Alain LaBont/e'/: "Re: Fwd: Wired 4.09 p. 130: Lost in Translation"
In reply to: Alain LaBont/e'/: "Re: Fwd: Wired 4.09 p. 130: Lost in Translation"
Next in thread: Michael Everson: "Re: Fwd: Wired 4.09 p. 130: Lost in Translation"

Alain wrote:

>At 07:20 28/08/1996 -0700, [email protected] wrote:
(that wasn't [email protected], but me :-).

>>Please be careful. To know whether an A is just an A, you only have
>>to check the next position. If that next position is not a combining
>>character, you know it is an A; if it is a combining character, you
>>know it is "something else".
>
>My 2 cents... "you only have to check the next positionS", and the plural
>may be an unbounded finite number. It indeed makes software more complex.

Alain - Please check the original mail by Michael Everson, or my mail,
where I have cited the relevant passage. To decide whether a Unicode A
is an A or something else, you indeed just have to look at the next
code. To decide whether it is an A-with-grave or an A-with-grave-and-
hook-below, for example, which is a different thing from what Michael
wrote, you have to look ahead by another position.
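
As a rough illustration of the one-position check described above, here is a
minimal Python sketch. The function name is_bare_letter is hypothetical, and
treating "combining character" as "general category starting with M" is an
assumption made for this example, not something stated in the thread.

```python
import unicodedata

def is_bare_letter(text, i):
    """Return True if text[i] is NOT followed by a combining character.

    Only the single position i + 1 is inspected. A "combining character"
    is assumed here to be any code point whose general category starts
    with "M" (Mn, Mc, Me).
    """
    nxt = text[i + 1:i + 2]          # empty string at end of text
    if not nxt:
        return True                  # nothing follows, so it is a bare letter
    return not unicodedata.category(nxt).startswith("M")

# "A" followed by "B" is just an A; "A" followed by U+0300
# (COMBINING GRAVE ACCENT) is "something else".
plain = is_bare_letter("AB", 0)
accented = is_bare_letter("A\u0300B", 0)
```

This answers only "A or something else?". Telling A-with-grave apart from
A-with-grave plus a second mark needs one further position, as noted above.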

>In actuality Vietnamese uses up to 2 diacritics per character (so at least 4
>different codings are to be taken care of at once too for E CIRCUMFLEX WITH
>DOT BELOW TONE MARK, for example), I would say that some linguistic case
>might require up to 5 or 6... But everything is allowed, 1 million
>diacritics after A at the limit. Somebody has to decide to stop that
>look-ahead in actual applications. N is advised. In a speech that I gave at
>the 4th UNICODE Workshop in Germany in 1992 about ordering UNICODE and
>string comparison, I had set N to 3, but N should be parameterized in
>software. But if one has the choice, he should encode fully composed
>characters as a preference, even under level 3 conformance, which is of
>course necessary to support (or to plan supporting at least), even if it is
>more complex.

The important thing is that for characters with N accents, you don't have
to look ahead by more than N+1 positions. And it is not the potential
number of accents that counts, but the actual number of accents present
in the current instance of the character.
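
The N+1 bound can be sketched in Python as follows. This is only an
illustration: the names is_mark and read_character are hypothetical, and a
"combining character" is again assumed to be any code point whose general
category starts with M.

```python
import unicodedata

def is_mark(ch):
    """Assumed test for a combining character: general category M*."""
    return unicodedata.category(ch).startswith("M")

def read_character(text, start):
    """Scan the base character at text[start] and its combining marks.

    Returns (end, looked): `end` is the index just past the last
    combining mark, and `looked` is how many positions after the base
    were inspected. For a character with N marks, `looked` is at most
    N + 1 (the extra one is the first non-mark that ends the scan).
    """
    i = start + 1
    looked = 0
    while i < len(text):
        looked += 1
        if not is_mark(text[i]):
            break                    # first non-mark ends the character
        i += 1
    return i, looked

# A with two marks (grave, then dot below, chosen only as an example),
# followed by B: N = 2, so three positions are inspected.
end, looked = read_character("A\u0300\u0323B", 0)
```

The loop stops at the first non-combining code point, so the cost depends on
the marks actually present and not on how many marks are theoretically allowed.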

Regards, Martin.

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:31 EDT