RE: Bangla: [ZWJ], [VIRAMA] and CV sequences

From: Gautam Sengupta
Date: Sat Oct 11 2003 - 04:48:40 CST

--- Marco Cimarosti wrote:
> Hallo, Mr. Sengupta.
>
> > Now let's consider the same pair of inputs in *my*
> > representation. They would be K+R+VIRAMA+I and
> > J+VIRAMA+AA+I. All that the morphological analyzer would
> > have to do is chop off the rightmost <I>. The leftovers
> > are exactly what we need: K+R+VIRAMA and
>
> Well, it sounds a bit like saying that automobiles should have
> square wheels because that works much better when the car climbs
> over a staircase...

The analogy is a wee bit far-fetched since you have
not shown that my wheels wouldn't run on ordinary
roads. :)

> I do agree that square wheels are better for climbing stairs, and
> that your representation is better for writing morphological
> analyzers (whatever in the world a "morphological analyzer" might be).

Well, machine-readable texts are of no use unless they
are going to be accessed by people through computer
programs. Let's say morphological analyzers constitute
a class of such programs that are very basic and
essential to all kinds of natural language processing
tasks (that ought to ring a bell).
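
As a rough illustration of the "chop off the rightmost <I>" step quoted
above, here is a minimal Python sketch. It assumes the input is already a
list of symbolic tokens such as K, R, VIRAMA and I; these are placeholder
labels taken from this thread, not actual Unicode code points, and
strip_suffix is a hypothetical helper, not part of any existing analyzer.

```python
def strip_suffix(tokens, suffix):
    """Return tokens without the trailing suffix, or None if it is absent.

    Both arguments are lists of symbolic labels (hypothetical stand-ins
    for encoded characters), e.g. ["K", "R", "VIRAMA", "I"].
    """
    if suffix and tokens[-len(suffix):] == suffix:
        return tokens[:-len(suffix)]
    return None


# The two inputs from the proposed representation, written with "+"
# as a separator purely for readability.
for word in ("K+R+VIRAMA+I", "J+VIRAMA+AA+I"):
    stem = strip_suffix(word.split("+"), ["I"])
    print(word, "->", "+".join(stem) if stem is not None else "(no match)")
```

The point of the sketch is only that, under this assumption about the
representation, suffix removal is a plain slice on the token sequence and
needs no knowledge of how the tokens are rendered.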

> However, you probably agree that climbing stairs is not exactly the
> primary purpose of automobiles, so why do you think that implementing
> morphological analyzers should be the primary purpose of a character
> encoding?

What exactly is the primary purpose of a character
encoding scheme? In all domains of scientific inquiry
there are certain principles for discriminating
between theories and choosing one above the other.
What is the evaluation metric for encoding schemes? :-)

Best, Gautam.


This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:24 CST