UCD 3.1, Final Beta - Case folding

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Mon Mar 05 2001 - 11:19:46 EST


Mark,

Is case folding part of UCD 3.1? It is also part of the files submitted
with 3.1.

I noticed that there is no mention of the casing special case:

# Lithuanian

0307; 0307; ; ; lt AFTER_i; # Remove DOT ABOVE after "i" with upper or titlecase

The case folding is locale-less, so it seems to me that it is probably better
to remove the COMBINING DOT ABOVE after every 'i' / 'I' regardless of locale
to make it work for Lithuanian. I doubt that this will cause serious
problems with caseless compares for other locales.
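
To make the suggestion concrete, here is a minimal sketch in Python. The
function name is hypothetical, and it only handles a COMBINING DOT ABOVE
(U+0307) that immediately follows the letter; the AFTER_i condition in the
casing data may be broader than that.

```python
DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE

def fold_ignoring_dot_after_i(text: str) -> str:
    """Hypothetical locale-less fold: case fold, then drop U+0307 after 'i'."""
    # Folding first means a capital 'I' is already 'i' when the dot is checked.
    folded = text.casefold()
    return folded.replace("i" + DOT_ABOVE, "i")
```

With this, "I" + U+0307 and plain "i" compare equal, while a dot above any
other letter is left alone.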

Also too late for 3.1 but you should consider for future versions:

Caseless compares are used for comparing character values regardless of form.
You include some small forms but not all. For example small form variants
FE50-FE6F are not included.

Also not included are the small form Katakana letters 30A1, 30A3, 30A5, 30A7,
30A9, 30C3, 30E3, 30E5, 30E7.
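
A rough sketch of what folding these could look like, in Python. The
function and table names are made up for illustration. It assumes each small
Katakana listed above folds to the full-size letter at the next code point,
and that small form variants are recognized by their <small> compatibility
decomposition in whatever UCD version the Python runtime ships.

```python
import unicodedata

# Small Katakana -> full-size Katakana (the full-size letter is code point + 1).
SMALL_KANA = {cp: cp + 1 for cp in
              (0x30A1, 0x30A3, 0x30A5, 0x30A7, 0x30A9,
               0x30C3, 0x30E3, 0x30E5, 0x30E7)}

def fold_small_forms(text: str) -> str:
    """Hypothetical fold of small Katakana and <small> compatibility forms."""
    out = []
    for ch in text.translate(SMALL_KANA):
        decomp = unicodedata.decomposition(ch)  # e.g. "<small> 002C"
        if decomp.startswith("<small>"):
            # Replace the small form variant with its base character(s).
            ch = "".join(chr(int(cp, 16)) for cp in decomp.split()[1:])
        out.append(ch)
    return "".join(out)
```

Characters outside these two groups pass through unchanged.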

It also seems to me that if folding is designed to produce essential
characters you should also fold halfwidth and fullwidth forms.

FF21; C; FF41; # FULLWIDTH LATIN CAPITAL LETTER A

should be:

FF21; C; 0061; # FULLWIDTH LATIN CAPITAL LETTER A

And

FF41; C; 0061; # FULLWIDTH LATIN SMALL LETTER A
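
One way to approximate this today, as a sketch only: apply compatibility
normalization before case folding. The function name is hypothetical, and
NFKC changes more than width (ligatures, superscripts and so on), so this is
a wider fold than the two mappings above.

```python
import unicodedata

def fold_width_and_case(text: str) -> str:
    """Hypothetical fold: NFKC removes width variants, then case fold."""
    # NFKC replaces <wide> and <narrow> compatibility characters
    # with their ordinary counterparts.
    return unicodedata.normalize("NFKC", text).casefold()
```

Under this fold both FF21 and FF41 end up as 0061, whereas plain case folding
only takes FF21 to FF41.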

Why do you only fold some presentation forms? You fold Latin and Armenian
ligatures FB00 - FB17 but FB1D to FDFF (Hebrew & Arabic) are not.
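
The difference can be checked with a small Python helper. The name is
hypothetical, and the results depend on the UCD version bundled with the
Python runtime, not on the 3.1 beta files.

```python
import unicodedata

def describe(cp):
    """Return (changed_by_case_folding, has_compatibility_decomposition)."""
    ch = chr(cp)
    changed = ch.casefold() != ch
    # Compatibility decompositions carry a tag such as "<compat>" or "<isolated>".
    has_compat = unicodedata.decomposition(ch).startswith("<")
    return changed, has_compat
```

For FB00 (LATIN SMALL LIGATURE FF) both values are true. For FB4F (HEBREW
LIGATURE ALEF LAMED) only the second is true, so the ligature survives case
folding even though it has a compatibility decomposition.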

Carl

-----Original Message-----
From: Mark Davis [mailto:mark.davis@us.ibm.com]
Sent: Friday, March 02, 2001 5:03 PM
To: Unicode List
Subject: UCD 3.1, Final Beta Review Period

The latest versions of the files in the Beta Unicode Character Database 3.1
are available for public review. See:

http://www.unicode.org/unicode/standard/versions/beta-ucd31.html
http://www.unicode.org/Public/3.1-Update/
(or ftp://www.unicode.org/Public/3.1-Update/)

There are a number of changes from the previous version of the UCD. For
more information, see UTR #27, Unicode 3.1, and the
UnicodeCharacterDatabase html file. We strongly encourage everyone to test the
data in these files with your products, and report any errors.

The final beta review period for UCD 3.1 closes on March 14th, 2001 at
12:00 GMT. Please report any bugs by that date.


