From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Mar 20 2004 - 06:02:15 EST
From: "Charles Cox" <charles@agenoria.fsnet.co.uk>
> Curtis Clark asked:
> > Are there any languages that use letters with diacriticals, but *never*
> > use the base letter without diacriticals? A made-up example to explain
> > what made me think of it: Let's say a language has "ö", to represent the
> > same sound that it does in German, but not "o", because the language
> > lacks the sound represented by that letter in common European languages
> > (the alternative being to use "o" to represent the "ö" sound).
>
> I believe Maltese uses "c" with a dot above but doesn't use the basic "c".
Does a Maltese keyboard require the user to enter two keystrokes instead of
just pressing the "C" key? Or does it map "c with dot above" to a separate key?
I think that for such languages, there's a common folding rule that collates
the dotted and undotted c together as if they were the same letter, which
makes automatic spelling correction possible.
It looks exactly like the rule that an Irish collation table and folding rule
could use to unify the two possible encodings of the 'i' vowel.
For Unicode, these pairs of letters are distinct, but nothing forbids a
language-specific collation and folding rule from equating letters that it
considers equivalent, simply because both encodings are common in the many
working environments where there's a huge legacy use of the ASCII letter.
Each time there's a huge legacy usage of an "incorrect" spelling produced by
legacy encodings or limited keyboards, there's an opportunity to "correct" the
spelling with such language-specific folding and collation rules...
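To make the idea concrete, here is a minimal sketch in Python of such a
language-specific folding, assuming a Maltese-style rule that treats the
dotted and undotted c as one letter for comparison purposes. The table
MALTESE_FOLD and the helpers fold_key and same_word are hypothetical names
made up for this illustration; they are not taken from any Unicode data file
or official tailoring.

```python
import unicodedata

# Hypothetical folding table, for illustration only: it maps the dotted
# letter to the plain ASCII letter that legacy text often uses instead.
MALTESE_FOLD = {
    "\u010b": "c",  # LATIN SMALL LETTER C WITH DOT ABOVE
    "\u010a": "C",  # LATIN CAPITAL LETTER C WITH DOT ABOVE
}

def fold_key(text: str) -> str:
    """Return a comparison key in which dotted and undotted c are equal."""
    # Compose first, so that "c" + U+0307 COMBINING DOT ABOVE becomes the
    # single precomposed letter before the table lookup.
    composed = unicodedata.normalize("NFC", text)
    folded = "".join(MALTESE_FOLD.get(ch, ch) for ch in composed)
    # Case folding is an extra assumption here, not part of the rule itself.
    return folded.casefold()

def same_word(a: str, b: str) -> bool:
    """Compare two spellings under the folding above."""
    return fold_key(a) == fold_key(b)
```

With this key, a legacy ASCII spelling and the dotted spelling of the same
word compare equal, so a spelling corrector could look up the folded key and
propose the dotted form. The original text is never modified; only the key
used for matching is folded.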
Unicode already lists some of those possible foldings, but a more extensive
search in various languages could list a lot of useful foldings that would help
solve the problems caused by encoding differences that are not really
orthographic differences... See for example the foldings commonly used in Asian
languages with wide and narrow variants of letters in association with the
Japanese or Korean scripts.
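As a rough illustration of the wide/narrow case, the sketch below uses
Python's standard unicodedata module. It assumes that NFKC compatibility
normalization is an acceptable stand-in for a pure width folding; NFKC also
folds other compatibility characters (ligatures, superscripts and so on), so
it is broader than a width-only rule. The function name fold_width is
hypothetical.

```python
import unicodedata

def fold_width(text: str) -> str:
    """Approximate width folding via NFKC compatibility normalization.

    Fullwidth Latin letters and digits become their ASCII counterparts,
    and halfwidth katakana become the ordinary (wide) katakana.
    """
    return unicodedata.normalize("NFKC", text)

# Fullwidth "ABC123" folds to plain ASCII.
wide_latin = "\uff21\uff22\uff23\uff11\uff12\uff13"
print(fold_width(wide_latin) == "ABC123")

# Halfwidth KA + halfwidth voiced sound mark folds to the single letter GA.
narrow_kana = "\uff76\uff9e"
print(fold_width(narrow_kana) == "\u30ac")
```

A dedicated width-only table would be needed where the other compatibility
distinctions have to be preserved.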