"We are right in the middle of the Unicode 3.1 release"
You have a new special casing file as part of the 3.1 beta and I thought
that this would be a part of 3.1.
That file defines FINAL as:
# FINAL: The letter is not followed by a letter of general category L*
# (e.g. Ll, Lt, Lu, Lm, or Lo).
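To make the difference concrete, here is a rough Python sketch of that FINAL condition next to the scanning variant suggested in the quoted message further down this thread. The function names and the SOFT_HYPHEN constant are made up for illustration. Treating "combining mark" as general category M* is an assumption, not something either the file or the thread spells out.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # hypothetical constant name; U+00AD SOFT HYPHEN


def is_final_simple(text: str, i: int) -> bool:
    """FINAL as worded in the file: the character right after text[i]
    is not a letter of general category L*."""
    nxt = text[i + 1 : i + 2]  # empty string at end of text
    return not (nxt and unicodedata.category(nxt).startswith("L"))


def is_final_scanning(text: str, i: int) -> bool:
    """Variant from the thread: skip combining marks (assumed M*) and
    soft hyphens, then test the first remaining character."""
    for ch in text[i + 1 :]:
        if ch == SOFT_HYPHEN or unicodedata.category(ch).startswith("M"):
            continue  # keep scanning to see what follows
        return not unicodedata.category(ch).startswith("L")
    return True  # nothing but skippable characters up to the end of text


# Sigma followed by a combining acute accent and then alpha.
sample = "\u03c3\u0301\u03b1"
simple_result = is_final_simple(sample, 0)
scanning_result = is_final_scanning(sample, 0)
```

With the sample string the two tests disagree: the file's wording sees a non-letter right after the sigma and calls it final, while the scanning variant looks past the combining mark, finds a letter, and does not.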
I noticed that you did add Azeri to the dotted/dotless i special casing.
# Turkish, Azeri
0049; 0131; 0049; 0049; tr; # LATIN CAPITAL LETTER I
0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I
0049; 0131; 0049; 0049; az; # LATIN CAPITAL LETTER I
0069; 0069; 0130; 0130; az; # LATIN SMALL LETTER I
# Note: the following cases are already in the UnicodeData file.
# 0131; 0131; 0049; 0049; tr; # LATIN SMALL LETTER DOTLESS I
# 0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE
You might also add the following comment lines or state that cases that are
already in the UnicodeData file apply to both Turkish and Azeri:
# 0131; 0131; 0049; 0049; az; # LATIN SMALL LETTER DOTLESS I
# 0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE
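As an illustration of how an implementation might consume these entries, here is a small Python sketch that parses the four tr/az lines above and applies them per locale. It assumes the field order is code; lower; title; upper; locale, as the lines suggest. The names SPECIAL_CASING, UNICODE_DATA, parse_special_casing and convert are hypothetical, and the two dotless/dotted entries are hard-coded from the "already in the UnicodeData file" note rather than read from the real file.

```python
# The four locale-specific lines quoted above, copied verbatim.
SPECIAL_CASING = """\
0049; 0131; 0049; 0049; tr; # LATIN CAPITAL LETTER I
0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I
0049; 0131; 0049; 0049; az; # LATIN CAPITAL LETTER I
0069; 0069; 0130; 0130; az; # LATIN SMALL LETTER I
"""

# Mappings the note says are already in UnicodeData (locale-independent).
UNICODE_DATA = {
    "\u0131": {"lower": "\u0131", "upper": "I"},  # LATIN SMALL LETTER DOTLESS I
    "\u0130": {"lower": "i", "upper": "\u0130"},  # LATIN CAPITAL LETTER I WITH DOT ABOVE
}


def parse_special_casing(data: str) -> dict:
    """Build {(locale, char): {"lower": ..., "upper": ...}} from the lines."""
    table = {}
    for line in data.splitlines():
        body = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not body:
            continue
        code, lower, _title, upper, locale = [f.strip() for f in body.split(";")][:5]
        table[(locale, chr(int(code, 16)))] = {
            "lower": chr(int(lower, 16)),
            "upper": chr(int(upper, 16)),
        }
    return table


TABLE = parse_special_casing(SPECIAL_CASING)


def convert(text: str, locale: str, kind: str) -> str:
    """Case-convert text; kind is "lower" or "upper"."""
    out = []
    for ch in text:
        entry = TABLE.get((locale, ch)) or UNICODE_DATA.get(ch)
        # Fall back to Python's default casing when no entry applies.
        out.append(entry[kind] if entry else getattr(ch, kind)())
    return "".join(out)


turkish_lower = convert("DIYARBAKIR", "tr", "lower")
azeri_upper = convert("i", "az", "upper")
```

Because the lookup is keyed on locale, any other language tag falls through to the default mappings, so "i" still uppercases to a plain "I" outside tr and az.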
Have you looked into Tatar (TT) & Bashkir (BA) LATIN SMALL LETTER DOTLESS
I/LATIN CAPITAL LETTER I WITH DOT ABOVE support?
I started looking into how official these Latin alphabets are but got busy
with other matters.
From: Mark Davis [mailto:firstname.lastname@example.org]
Sent: Saturday, March 03, 2001 7:27 AM
To: Unicode List
Subject: Re: Help with Greek special casing
Yes, that was filed as a bug, and will be fixed the next time we update the
case mappings. We are right in the middle of the Unicode 3.1 release, so
that will be coming sometime later.
----- Original Message -----
From: "Nick Nicholas" <email@example.com>
To: "Unicode List" <firstname.lastname@example.org>
Sent: Saturday, March 03, 2001 00:11
Subject: RE: Help with Greek special casing
> At 09:56 -0800 2001-03-01, Carl W. Brown wrote:
> >It looks like the Unicode TR 21 special casing rules for the Greek final
> >sigma are not quite right.
> >The final sigma in modern Greek should only be used at the end of a word,
> >including the case where separate words are joined with hard hyphens. If it
> >is followed by a character such as a combining mark or soft hyphen, you
> >continue scanning to see what follows. If it is followed by a letter then it
> >is not final.
> >A simpler test might be to see if a letter or a spacing character or hard
> >hyphen is found first. If it is a letter then it is not a final sigma.
> Which is what we do at the TLG with Beta code (whose S is both medial or
> final); in fact, Beta code conflates hard hyphens and dashes anyway,
> considering the (em) dash, without space, punctuation.
> If the Unicode rules are wrong, well, I hope those that can fix them are
> tuned in. :-)
> Nick Nicholas, Thesaurus Linguae Graecae. email@example.com
> "All the nations also under his dominion were filled with joy and
> inexpressible gladness at not being even for a moment deprived of the
> benefits of a well ordered government."
> --- Eusebius of Caesaria on the accession of Constantine I.