Re: Pan-Turkic Alphabet of 1926, Latin letter like U+042C/U+044C or U+0184/U+0185

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Sun Apr 23 2006 - 13:41:30 CST

In reply to: Karl Pentzlin: "Pan-Turkic Alphabet of 1926, Latin letter like U+042C/U+044C or U+0184/U+0185"
Next in thread: Anto'nio Martins-Tuva'lkin: "Re: Pan-Turkic Alphabet of 1926, Latin letter like U+042C/U+044C or U+0184/U+0185"

Karl Pentzlin wrote on Sunday, April 23, 2006 at 6:30 PM
Subject: Pan-Turkic Alphabet of 1926, Latin letter like U+042C/U+044C or
U+0184/U+0185

> According to the sources:
> http://en.wikipedia.org/wiki/Janalif
> http://en.wikipedia.org/wiki/Uniform_Turkic_alphabet
> http://www.omniglot.com/writing/azeri.htm
> there is a Latin letter after the "i" in the Pan-Turkic alphabet looking
> like Cyrillic U+042C/U+044C (soft sign) which has the function of the
> dotless i in modern Turkish, Azeri and Tatar.
>
> This letter is not encoded in Unicode as such.
>
> Michael Everson states in "Some Türkmen alphabets"
> http://www.evertype.com/standards/iso10646/pdf/turkmen.pdf :
> Latin Ьь are not encoded in the UCS, complicating things like
> monolingual multiscript ordering since the current UCS expects Cyrillic Ьь
> to do double duty. There are lots of Asian and Caucasian languages using
> this particular pair in multiple scripts.
>
> --
> 1. Is there a specific reason not to encode that letter,

It can be justified by the principle of separation of scripts, as
exemplified by the distinction of Latin 'o' and its Greek and Cyrillic
counterparts.
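
As a rough illustration of that separation, the sketch below looks up the three 'o' lookalikes in the Unicode character database. Taking the first word of the character name as the script is an assumption made for brevity (the helper script_of is hypothetical), not an official script property lookup.

```python
import unicodedata

# Three near-identical letters, one per script: Latin o, Greek omicron, Cyrillic o.
lookalikes = ["\u006F", "\u03BF", "\u043E"]

def script_of(ch):
    # Assumption: the first word of the character name stands in for the script.
    return unicodedata.name(ch).split()[0]

scripts = [script_of(ch) for ch in lookalikes]
for ch, script in zip(lookalikes, scripts):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {script}")
```

The same lookup reports U+044C as Cyrillic and U+0185 as Latin, which is the distinction the question turns on.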

> especially
> as its similarity to the Cyrillic soft sign is only superficial
> (it is no soft sign - the "functionally next similar" Cyrillic
> letter is U+042B Ы, not U+042C Ь)

You clearly don't know Church Slavonic :) Seriously, though, the soft and
hard signs originally functioned as short vowels /i/ and /u/, which were
then mostly lost as the Slavonic languages developed. Thus to an English
speaker, it actually seems extremely appropriate!

> - as if not to encode Latin U+0058/U+0078 Xx as you can use
> Cyrillic U+0425/U+0445 Хх instead.

Do some research in the epichoric Greek alphabets, and you'll find your
suggestion is not as daft as it sounds. Some cities used the letter for
/kh/, some for /ks/. See
http://luna.cas.usf.edu/~murray/classes/cg/alphabet.htm , for example.
Again, though, the principle of script separation avoids confusion.

> 2. The Latin letters U+0184/U+0185 LATIN capital/small LETTER TONE SIX
> look very similar (except that the reference glyphs have a little
> left-pointing triangle at their top instead of a serif).
> Would it be a reasonable idea to unify the missing Latin letters Ьь
> with these?

At first sight that seems totally crazy, but I think it is actually
reasonable. These letters in the obsolete Zhuang writing system are
actually based on the digit '6'. However, it was considered wrong to use
the digit '6', and therefore the similarly shaped but significantly distinct
Cyrillic letter, soft yer, was used. Just look at the comments on it in
TUS - http://www.unicode.org/charts/PDF/U0180.pdf ! Key questions for this
unification would be:

1) Are the glyphs too distinct? - 'Considerable variation is to be expected
in actual fonts.'
2) Are we to think of letter tone six as soft yer transferred to the Latin
script?

The other question is just how much trouble is caused by using the Cyrillic
soft yer as a Latin letter - there will be a disunification cost. The
sorting issue can get one short, sharp reply - Tailor your collation! Does
the Cyrillic soft yer occur as a 1-letter word in both scripts? If not,
tailoring can sense the script by contracting with an adjacent letter.
(Straight Russian can probably reasonably take pot luck.)
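
A minimal sketch of that idea, under stated assumptions: it is a hand-rolled sort key, not real collation tailoring syntax, and the function names, the weights and the choice to sort the letter just after 'i' (as in the Pan-Turkic order Karl describes) are all made up for illustration. A soft sign with a Latin neighbour is treated as a Latin letter; anywhere else it keeps its Cyrillic position.

```python
import unicodedata

SOFT_SIGNS = {"\u042C", "\u044C"}  # Cyrillic capital/small soft sign

def is_latin(ch):
    # Rough script test via the character name; an assumption of this sketch.
    return unicodedata.name(ch, "").startswith("LATIN")

def sort_key(word):
    """Hypothetical tailored key: a soft sign next to a Latin letter sorts just after 'i'."""
    word = word.lower()
    key = []
    for pos, ch in enumerate(word):
        # The letters immediately before and after, if any.
        neighbours = word[max(pos - 1, 0):pos] + word[pos + 1:pos + 2]
        if ch in SOFT_SIGNS and any(is_latin(n) for n in neighbours):
            # Weight of 'i' plus a tie-breaker of 1, so it lands between 'i' and 'j'.
            key.append((ord("i"), 1))
        else:
            # Everything else keeps plain code point order.
            key.append((ord(ch), 0))
    return key

words = ["kjz", "k\u044Cz", "kiz"]  # made-up strings, not real vocabulary
print(sorted(words, key=sort_key))
```

A soft sign standing alone has no neighbour to contract with, so it falls back to its Cyrillic weight; that is exactly the 1-letter-word case asked about above.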

Richard.
