Re: Case-folding dotted i

From: Eric Muller <emuller_at_adobe.com>
Date: Tue, 29 Jan 2013 11:17:53 -0800

On 1/24/2013 2:15 AM, Richard Wordingham wrote:
> If text is going to be processed, i+dot is wrong for Turkish, as the
> Unicode casing rules for Turkish will capitalise it to I+dot+dot,
> which should display with two dots. If you're going to use an explicit
> dot, I'd have said <U+0131, U+0307> would be better, though I still
> think using an explicit dot is wrong in general. Richard.

Six abstract characters (hard-dotted, dotless, and soft-dotted, each in
two cases) for four coded characters: something has to break somewhere.

With the current practice, there is inherent ambiguity.

The current practice is tolerable only in the presence of locale
information, in which case the addition of combining dots in case
transformation is useless and, as Richard showed, in fact harmful.
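
A rough sketch of that failure, using Python only for illustration.
The built-in str.lower() applies the locale-independent mappings, which
turn U+0130 into <U+0069, U+0307>. Python's str methods have no Turkish
casing, so turkish_upper below is a hypothetical helper that hand-codes
just the Turkish i -> U+0130 rule; it is an assumption for this example,
not a library function.

```python
import unicodedata

def turkish_upper(s: str) -> str:
    """Hypothetical helper: map i to U+0130 (Turkish rule), then apply default uppercasing."""
    return s.replace("\u0069", "\u0130").upper()

def codepoints(s: str) -> str:
    """Render a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Locale-independent lowercasing of U+0130 adds a combining dot above.
lowered = "\u0130".lower()
print(codepoints(lowered))  # U+0069 U+0307

# Turkish uppercasing of that result yields U+0130 followed by U+0307;
# in NFD that is a capital I carrying two combining dots.
raised = turkish_upper(lowered)
print(codepoints(unicodedata.normalize("NFD", raised)))  # U+0049 U+0307 U+0307
```

With a known Turkish locale the first step would have produced plain
U+0069, and the second dot would never appear.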

Adding more characters (as in creating the hard-dotted form by using
dotless i + combining dot, <U+0131, U+0307>) breaks current practice.
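
One way to see the breakage, sketched under the assumption that existing
software relies on locale-independent casing and on NFC/NFD normalization
(Python's str.upper(), str.lower() and unicodedata.normalize stand in
for that software here):

```python
import unicodedata

# Hard-dotted lowercase spelled as dotless i + combining dot above.
hard_dotted = "\u0131\u0307"

# No precomposed character exists for this pair, so NFC leaves it as is,
# and it does not compare equal to the plain U+0069 found in existing text.
assert unicodedata.normalize("NFC", hard_dotted) == hard_dotted
assert unicodedata.normalize("NFD", hard_dotted) != unicodedata.normalize("NFD", "i")

# Default uppercasing gives <U+0049, U+0307>, which NFC composes to U+0130.
upper = unicodedata.normalize("NFC", hard_dotted.upper())
assert upper == "\u0130"

# Default lowercasing then returns <U+0069, U+0307>, not the original
# sequence, so the new spelling does not survive a case round trip.
assert upper.lower() == "\u0069\u0307"
assert upper.lower() != hard_dotted
```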

Eric.
Received on Tue Jan 29 2013 - 13:22:54 CST
