Re: Tamil glyphs

From: Marco Cimarosti (marco.cimarosti@europe.com)
Date: Wed Sep 13 2000 - 20:29:29 EDT


Antoine Leca wrote:
> > 1) <consonant + virama + ZWJ> should render the "half[...]
> Yes, and irrelevant on this matter (but I shall return on it
> later).

I admit it. It was just the first link in a loooong chain of examples.

> Paragraph 6 page 214, titled "explicit virama", says: "[...]
> placing the character U+200C zero width non joiner
> immediately after the encoded dead consonant that is to be
> excluded from conjunct formation."

Ugh... Sorry, my internal English parser just raised an exception.

> So that could be read a bit more heavier than just a rendering
> glyph issue, but rather affects the process of using
> conjuncts.

The author here is talking about rendering, so I assume that "conjunct
formation" refers to the process of selecting a single glyph. The term
"consonant *cluster*" would probably have been used instead to refer to
a logical (that is, phonetic) conjunct.
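
To keep the sequences under discussion straight, here is a minimal Python
sketch that spells out the three encodings as code points. The choice of
Devanagari KA and SSA as example consonants is my own assumption, as are the
labels on each sequence; they only restate the readings discussed in this
thread, not a settled interpretation.

```python
import unicodedata

# Code points taken from the Unicode character names; KA and SSA are
# arbitrary example consonants chosen for illustration.
KA = "\u0915"      # DEVANAGARI LETTER KA
SSA = "\u0937"     # DEVANAGARI LETTER SSA
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER


def describe(seq):
    """Return one 'U+XXXX NAME' string per character in seq."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in seq]


# Labels are hypothetical shorthand for the readings debated in the thread.
sequences = {
    "plain (conjunct may form)": KA + VIRAMA + SSA,
    "explicit virama requested": KA + VIRAMA + ZWNJ + SSA,
    "half form requested": KA + VIRAMA + ZWJ + SSA,
}

for label, seq in sequences.items():
    print(label)
    for line in describe(seq):
        print("   ", line)
```

Whether the ZWNJ in the second sequence affects only glyph selection or the
whole treatment of the cluster is exactly the question the sketch cannot
answer.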

But your point is 100% correct!

These are quite important matters for implementers, and they cannot be
settled by my assumptions about what she meant in the passage you cited,
based on the statistical distribution of synonyms in similar contexts...

> [...]
> R15 continues with "If the syllable contains a consonant
> cluster, then this vowel is always depicted to the left [...]

"Consonant cluster", you see...

> Can of worms, can of worms...

Poisonous worms, they are.

> [...] the i vowel sign should reorder around the
> whole cluster.

> This is well understood (and the latter point is specific to
> Devanagari and related scripts; Tamil as well as Oriya behaves
> differently here).

You're right again. And, incidentally, is this explained in the book? I may
be wrong (once again, I don't have the book with me), but I think I
learned it from this list, or somewhere else on the Internet.
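
For what it is worth, the Devanagari-style behaviour can be sketched in a few
lines of Python. This is only an illustration under my own assumptions: the
function name `visual_order` is made up, a "cluster" is taken to be a run of
consonants joined by viramas, only the vowel sign I (U+093F) is handled, and
ZWJ/ZWNJ are deliberately ignored because their effect on reordering is the
open question here. It is not the algorithm from the book, and it does not
cover the scripts that behave differently.

```python
VIRAMA = "\u094D"        # DEVANAGARI SIGN VIRAMA
VOWEL_SIGN_I = "\u093F"  # DEVANAGARI VOWEL SIGN I


def is_consonant(ch):
    """True for the Devanagari consonants KA..HA (U+0915..U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def visual_order(text):
    """Move each vowel sign I to the left of the whole preceding cluster.

    Assumption: a cluster is consonant (virama consonant)*; joiners are
    not handled at all.
    """
    out = list(text)
    i = 0
    while i < len(out):
        if out[i] == VOWEL_SIGN_I and i > 0 and is_consonant(out[i - 1]):
            start = i - 1
            # Extend leftwards over every "consonant + virama" pair.
            while (start >= 2 and out[start - 1] == VIRAMA
                   and is_consonant(out[start - 2])):
                start -= 2
            # Remove the sign and re-insert it before the cluster.
            out.insert(start, out.pop(i))
        i += 1
    return "".join(out)


# Logical order: SA, VIRAMA, THA, I, TA, I
logical = "\u0938\u094D\u0925\u093F\u0924\u093F"
visual = visual_order(logical)
print([f"U+{ord(ch):04X}" for ch in visual])
```

In the example the first vowel sign jumps over the whole SA + VIRAMA + THA
cluster, while the second only jumps over the single TA.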

> And on the other hand, ZWJ should otherwise retains its normal
> behaviour, which is described page 215 as (sorry, I quote from
> memory) preventing use of specific ligature or cluster when
> available. [...]
> [...]
> I agree with your idea, but using ZWJ instead.

Barrel of worms! How does this match with the recently defined additional
meaning of ZWJ as "zero width ligator"? It forms ligatures in Fraktur but
splits them in Tamil? (Proposal for a new alternative name: "zero width
schizophrenic" :-).

> We agree this is an area where we really need some light, and
> a firmer guide of implementation from the Unicode consortium.
> What is the way to request a more strong rule of
> interpretation?

This is the only clear thing, especially if you consider that we have been
talking almost exclusively about Devanagari and Tamil, which are the two
well-documented Indic scripts in the book.

I hope that Unicode will soon issue something. Maybe one or more UTRs
covering the biggest gaps. E.g.:

- A (normative?) algorithm for "Indic reordering", with all script-specific
variants;

- A clarification of the intended behavior of ZW[N]J in Indic scripts;

- Non-normative lists of glyphs needed for "other" Indic scripts, on the
model of the existing chapters for Devanagari and Tamil.

If Unicode does not provide this documentation fast, some well-known
computer firm will provide it, and it will consist of only two words, the
first one being "buy" and the second one a product name.

_ Marco




This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:13 EDT