From: Kent Karlsson (email@example.com)
Date: Tue Jul 17 2007 - 04:33:40 CDT
Michael Maxwell wrote:
> Because when we are entering Indic script text
> (for example), I have found it very helpful to
> have something obvious appear on the screen that
> indicates I made a mistake at the level of the script.
To which Kent Karlsson replied:
> There is no error at "the level of the script".
I thought my meaning would be obvious, but apparently I was wrong.
By "error at the level of the script", I meant having a dependent character without any character for it to be dependent on.
In that case there is an implicit NBSP base.
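The notion of a mark with no explicit base can be made concrete with a short sketch. It assumes, following the Unicode definition of a base character, that any graphic non-combining character (letter, number, punctuation, symbol or space) can serve as a base, in any script. It flags only marks that follow nothing at all, or a control or line/paragraph separator, which are the cases where a renderer would have to supply an implicit base such as NBSP. The function names are hypothetical, and the treatment of ZWJ/ZWNJ is a simplification.

```python
import unicodedata

ZWNJ, ZWJ = "\u200c", "\u200d"


def is_base(ch):
    """True for graphic, non-combining characters: L, N, P, S and Zs."""
    cat = unicodedata.category(ch)
    return cat[0] in "LNPS" or cat == "Zs"


def marks_without_base(text):
    """Return indices of combining marks that have no explicit base before them."""
    hits = []
    has_base = False
    for i, ch in enumerate(text):
        if unicodedata.category(ch).startswith("M"):
            # Mn, Mc and Me all count; Bengali vowel signs are mostly Mc.
            if not has_base:
                hits.append(i)
        elif ch in (ZWNJ, ZWJ):
            # Simplification: joiners neither start nor end a sequence.
            continue
        else:
            has_base = is_base(ch)
    return hits
```

Under this definition a lone vowel sign O (U+09CB) is reported, but MA + O + O + O is not, because every mark in it still follows the base MA.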
A diacritic not preceded by a base character, or a Bengali (etc.) dependent vowel sign not preceded by a Bengali consonant.
(Assuming this case is meant to be disjoint from the previous case:) But there is still a base character for it. The base needn't be
in the Bengali script. Or there may be other combining characters (Bengali or not) between the base and the considered instance of a
One can imagine living in a parallel universe in which Unicode (and ISCII) represented Bengali (etc.) vowel signs and vowel letters
as alternative glyphs of a single character/code point.
That would be just plain wrong. The combining ("dependent") vowels and the independent vowels look different, and behave
differently. It is NOT a matter of glyph variation.
In that case, I suppose a sequence of Bengali characters MA + O + O + O would be rendered as Bengali 'M' with the 'O' vowel sign to
its left and right, followed by two 'O' vowel letter glyphs.
That is a completely different kind of character string than the ones we are talking about.
(That's just a guess on my part of what the appropriate behavior would be, based on other vowel sequences I've seen in
Bengali--which are typed as vowel sign followed by vowel letter.)
But I don't (and I suspect you don't) live in that universe. I live in a universe in which vowel signs and vowel letters are
distinguished in Unicode as distinct code points. And so in my universe a sequence of vowel signs is just as bad as a diacritic
without a base character,
In that case there is an implicit NBSP base. Note that you can have multiple diacritics applied to a base character.
and it doesn't require a spell checker to know that. Hence an error at the level of the script (OK, to be technical, the script as
implemented in Unicode/ISCII).
Deviating from the most common (or official) application of a script does not constitute an "error at the level of the script". If
I write moooose, I deviate from the common (official) application of the Latin script (and you can detect that without using a spell
checker). That does not make it an error "at the level of the script". That argument does not change just because the vowel
characters are combining characters.
Putting it differently, a sequence of vowel signs would be just as bad in any other language using the Bengali (etc.)
script (Assamese, say), whereas a spell checker would be particular to a certain language (and probably to a single writing system
for that language).
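The stricter, language-independent check described here (a dependent vowel sign not directly preceded by a consonant of the same script) might be sketched as follows. This is only an illustration of that position, not a rule taken from the Unicode Standard. The code point ranges, the decision to allow a nukta between consonant and vowel sign, and the inclusion of the two Assamese letters U+09F0 and U+09F1 are assumptions made for the example, and all names are hypothetical. The text is normalized to NFC first so that the two-part vowel signs typed as U+09C7 + U+09BE are not reported as two signs in a row.

```python
import unicodedata

NUKTA = "\u09bc"
ASSAMESE_RA_WA = {"\u09f0", "\u09f1"}  # assumption: treated as consonants


def is_consonant(ch):
    """Bengali-block consonant: KA..HA, or the two Assamese letters."""
    if ch in ASSAMESE_RA_WA:
        return True
    # The category test skips unassigned code points inside the range.
    return 0x0995 <= ord(ch) <= 0x09B9 and unicodedata.category(ch) == "Lo"


def is_vowel_sign(ch):
    """Bengali dependent vowel sign (AA..AU, vocalic L and LL)."""
    cp = ord(ch)
    in_range = 0x09BE <= cp <= 0x09CC or cp in (0x09E2, 0x09E3)
    return in_range and unicodedata.category(ch).startswith("M")


def stray_vowel_signs(text):
    """Return indices (in the NFC form of text) of vowel signs that do not
    directly follow a consonant, optionally with a nukta in between."""
    text = unicodedata.normalize("NFC", text)
    hits = []
    for i, ch in enumerate(text):
        if not is_vowel_sign(ch):
            continue
        j = i - 1
        if j >= 0 and text[j] == NUKTA:
            j -= 1  # look past a nukta to the consonant it modifies
        if j < 0 or not is_consonant(text[j]):
            hits.append(i)
    return hits
```

For MA + O + O + O this reports the second and third vowel signs (indices 2 and 3), with no dictionary involved; whether to call that an error or merely an unusual use of the script is the point under discussion.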
CASL/ U MD