RE: Help with Greek special casing

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Sat Mar 03 2001 - 13:42:14 EST


Mark,

"We are right in the middle of the Unicode 3.1 release"

You have a new special casing file as part of the 3.1 beta and I thought
that this would be a part of 3.1.

http://www.unicode.org/Public/3.1-Update/SpecialCasing-4d1.beta.txt

That file defines FINAL as:

# FINAL: The letter is not followed by a letter of general category L*
# (e.g. Ll, Lt, Lu, Lm, or Lo).
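
Read literally, that condition only looks at the character immediately
after the sigma. A minimal sketch of that reading, assuming "followed by"
means the very next code point (the function name is hypothetical and is
not taken from the file):

```python
import unicodedata

def is_final_naive(text, i):
    """FINAL as literally worded: the next character, if any, is not L*."""
    nxt = text[i + 1 : i + 2]  # empty string at end of text
    if not nxt:
        return True
    return not unicodedata.category(nxt).startswith("L")
```

Under this reading a sigma followed by a soft hyphen (U+00AD, category Cf)
and then a letter is still treated as final, which is the case raised
earlier in this thread.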

I noticed that you did add Azeri to the dotted/dotless i special casing.

# Turkish, Azeri

0049; 0131; 0049; 0049; tr; # LATIN CAPITAL LETTER I
0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I

0049; 0131; 0049; 0049; az; # LATIN CAPITAL LETTER I
0069; 0069; 0130; 0130; az; # LATIN SMALL LETTER I

# Note: the following cases are already in the UnicodeData file.

# 0131; 0131; 0049; 0049; tr; # LATIN SMALL LETTER DOTLESS I
# 0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE

You might also add the following comment lines, or state that the cases
already in the UnicodeData file apply to both Turkish and Azeri:

# 0131; 0131; 0049; 0049; az; # LATIN SMALL LETTER DOTLESS I
# 0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE
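
To show how an implementation might consume these entries, here is a
minimal sketch. It assumes the language is passed as a two-letter code and
that only the four i-mappings above differ from the default case mappings;
the names TURKIC, lower_i and upper_i are hypothetical.

```python
# Languages that use the dotted/dotless i mappings listed above.
TURKIC = {"tr", "az"}

def lower_i(text, lang):
    """Lowercase text, mapping I -> dotless i and dotted I -> i for tr/az."""
    if lang in TURKIC:
        text = text.replace("\u0049", "\u0131").replace("\u0130", "\u0069")
    return text.lower()

def upper_i(text, lang):
    """Uppercase text, mapping i -> I WITH DOT ABOVE for tr/az."""
    if lang in TURKIC:
        text = text.replace("\u0069", "\u0130")
    return text.upper()
```

The language-specific replacements run first, so the default mappings never
see a character whose tr/az mapping differs from the default one.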

Have you looked into Tatar (TT) & Bashkir (BA) LATIN SMALL LETTER DOTLESS
I/LATIN CAPITAL LETTER I WITH DOT ABOVE support?

http://rferl.org/bd/tb/tatar/TATAR/abs.html

I started looking into how official these Latin alphabets are but got busy
with other matters.

Carl

-----Original Message-----
From: Mark Davis [mailto:markdavis34@home.com]
Sent: Saturday, March 03, 2001 7:27 AM
To: Unicode List
Subject: Re: Help with Greek special casing

Yes, that was filed as a bug, and will be fixed the next time we update the
case mappings. We are right in the middle of the Unicode 3.1 release, so
that will be coming sometime later.

Mark

----- Original Message -----
From: "Nick Nicholas" <nicholas@uci.edu>
To: "Unicode List" <unicode@unicode.org>
Sent: Saturday, March 03, 2001 00:11
Subject: RE: Help with Greek special casing

> At 09:56 -0800 2001-03-01, Carl W. Brown wrote:
>
> >It looks like the Unicode TR 21 special casing rules for the Greek final
> >sigma are not quite right.
> >
> >The final sigma in modern Greek should only be used at the end of a word,
> >including the case where separate words are joined with hard hyphens. If it
> >is followed by a character such as a combining mark or soft hyphen, you must
> >continue scanning to see what follows. If it is followed by a letter, then
> >it is not final.
> >
> >A simpler test might be to see whether a letter, a spacing character, or a
> >hard hyphen is found first. If it is a letter, then it is not a final sigma.
>
> Which is what we do at the TLG with Beta code (whose S is both medial and
> final); in fact, Beta code conflates hard hyphens and dashes anyway,
> considering the (em) dash, without space, punctuation.
>
> If the Unicode rules are wrong, well, I hope those that can fix them are
> tuned in. :-)
>
> Nick Nicholas, Thesaurus Linguae Graecae. nicholas@uci.edu
> www.tlg.uci.edu/~opoudjis
> "All the nations also under his dominion were filled with joy and
> inexpressible gladness at not being even for a moment deprived of the
> benefits of a well ordered government."
> --- Eusebius of Caesaria on the accession of Constantine I.
>
>
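
The scan-ahead test proposed in the quoted message can be sketched as
follows. This is only an illustration under stated assumptions: combining
marks are taken to be general category M*, the soft hyphen is U+00AD, and
anything else that is not a letter (space, hard hyphen, punctuation, end of
text) ends the word. The function name is hypothetical.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"

def is_final_sigma(text, i):
    """True if the sigma at index i is not followed by a letter,
    ignoring any combining marks and soft hyphens in between."""
    for ch in text[i + 1:]:
        if ch == SOFT_HYPHEN or unicodedata.category(ch).startswith("M"):
            continue  # transparent character: keep scanning
        # First significant character decides: a letter means not final.
        return not unicodedata.category(ch).startswith("L")
    return True  # reached end of text without finding a letter
```

With this rule a sigma followed by a soft hyphen and then a letter is
medial, while a sigma followed by a hard hyphen or a space is final.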


