Re: Counting Devanagari Aksharas from Richard Wordingham via Unicode on 2017-04-21 (Unicode Mail List Archive)

From: Richard Wordingham via Unicode <unicode_at_unicode.org>
Date: Sat, 22 Apr 2017 00:04:27 +0100

On Thu, 20 Apr 2017 11:17:05 -0700
Manish Goregaokar via Unicode <unicode_at_unicode.org> wrote:

> On Wed, Apr 19, 2017 at 4:35 PM, Richard Wordingham via Unicode
> <unicode_at_unicode.org> wrote:

> > Is there consensus on how to count aksharas in the Devanagari
> > script? The doubts I have relate to a visible halant in
> > orthographic syllables other than the first.

> I don't think there's consensus.

I've found related discussion at
https://lists.w3.org/Archives/Public/public-i18n-indic/. The question
of how to count was raised and not answered there.

> I'm of the opinion that Unicode should start considering Devanagari
> (and possibly other Indic) consonant clusters as single extended
> grapheme clusters.

Do Hindi speakers really think of orthographic syllables as characters?

What may be useful is a definition of an orthographic syllable. It may
be possible to get the information from a font - depending on the
renderer - but a locale-dependent definition should be possible for use
as a fall-back. Devanagari rules won't work for Tamil, and I think the
rules for Hindi and Nepali will differ slightly - <VIRAMA, ZWNJ> looks
like a problem.
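
As an illustration only, a rule-based fall-back of this kind might be
sketched as below. This is a minimal sketch under stated assumptions, not
an agreed definition: the names aksharas, attaches and join_across_zwnj
are hypothetical, the consonant ranges are a simplification, and the flag
only stands in for whatever a locale decides about <VIRAMA, ZWNJ>.

```python
import unicodedata

VIRAMA = "\u094D"
ZWJ = "\u200D"
ZWNJ = "\u200C"


def is_consonant(ch):
    # Simplified assumption: KA..HA plus the precomposed nukta forms QA..YYA.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def attaches(prev, ch, join_across_zwnj):
    # Joiners and combining marks (vowel signs, nukta, virama, anusvara,
    # visarga) always extend the current orthographic syllable.
    if ch in (ZWJ, ZWNJ) or unicodedata.category(ch) in ("Mn", "Mc"):
        return True
    if is_consonant(ch):
        # A consonant after VIRAMA (optionally followed by ZWJ) continues
        # the cluster.
        if prev.endswith(VIRAMA) or prev.endswith(VIRAMA + ZWJ):
            return True
        # <VIRAMA, ZWNJ> requests a visible halant; whether the cluster
        # still counts as one akshara is left to the caller (hypothetical
        # locale-dependent switch).
        if prev.endswith(VIRAMA + ZWNJ):
            return join_across_zwnj
    return False


def aksharas(text, join_across_zwnj=False):
    # Split text into orthographic syllables using the rules above.
    out = []
    for ch in text:
        if out and attaches(out[-1], ch, join_across_zwnj):
            out[-1] += ch
        else:
            out.append(ch)
    return out


# KA VIRAMA SSA TA VIRAMA RA VOWEL-SIGN-I YA ("kshatriya")
word = "\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"
print(len(aksharas(word)))

# KA VIRAMA ZWNJ SSA: the count depends on the chosen policy.
halant = "\u0915\u094D\u200C\u0937"
print(len(aksharas(halant)), len(aksharas(halant, join_across_zwnj=True)))
```

Under these rules a word-final visible halant stays attached to its
consonant, which is only one of the possible answers to the counting
question raised above.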

The concept is possibly not useful in some Indic scripts - it won't
work well for the Thai language, but will work for Pali in the Thai
script, for both Pali orthographies.

Richard.
Received on Fri Apr 21 2017 - 18:05:02 CDT
