Latin Letters Capital and Small Theta

From: Marcel Schneider <charupdate_at_orange.fr>
Date: Sun, 12 Jun 2016 00:20:12 +0200 (CEST)

The idea keeps coming up that the Greek theta used to write the
Rromani language in International Standard orthography (as well as
a number of other languages) will be, or ought to be, encoded as a
separate casing pair in Unicode.

LATIN CAPITAL LETTER THETA and LATIN SMALL LETTER THETA
were part of Michael Eversonʼs 2012 proposal at
http://www.unicode.org/L2/L2012/12138-n4262-unifon.pdf
with the intended code points U+A7B0 and U+A7B1. Some characters of that
proposal were accepted and others rejected, the Latin Theta pair among
them, but the Non-Approval Notices make no mention of this rejection.

Two years later the request was supported by
Denis Moyogo Jacqueryeʼs additional proposal at
http://www.unicode.org/L2/L2014/14202-latin-theta-delta.pdf
with a new rationale: the letters are required in the writing systems
of several natural languages.

On the sole criterion of glyph resemblance, two matching characters
already exist in Unicode:
03F4 GREEK CAPITAL THETA SYMBOL
03B8 GREEK SMALL LETTER THETA

Does the UTC consider it feasible to address the issue by implementing
a tailored casing pair for the locales concerned and adding an annotation
somewhere for the information of font designers, or can people expect to
see one day a successful proposal for LATIN CAPITAL LETTER THETA and
LATIN SMALL LETTER THETA? To date, nothing of the kind is found in the
Pipeline. (Experience shows, though, that a character rejected in one
proposal may still be accepted as part of a later one. That happened to
LATIN CAPITAL LETTER SMALL CAPITAL I, already found in Mr Eversonʼs 2012
proposal and added to Unicode in 2016.)
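
To make the first option concrete, here is a minimal Python sketch of
what a tailored casing pair could look like at the application level.
It is only an illustration under stated assumptions: the function names
rromani_upper and rromani_lower are hypothetical, the sample string is
arbitrary, and nothing here reflects an existing CLDR or library
tailoring.

```python
SMALL_THETA = "\u03B8"           # GREEK SMALL LETTER THETA
CAPITAL_THETA_SYMBOL = "\u03F4"  # GREEK CAPITAL THETA SYMBOL

# Hypothetical tailoring table: map the small theta to the Symbol variant
# before the default uppercasing runs.
_TAILORED_UPPER = {ord(SMALL_THETA): CAPITAL_THETA_SYMBOL}


def rromani_upper(text):
    """Uppercase text, sending U+03B8 to U+03F4 instead of U+0398."""
    # U+03F4 has no uppercase mapping of its own, so the default
    # str.upper() leaves it untouched after the translation step.
    return text.translate(_TAILORED_UPPER).upper()


def rromani_lower(text):
    """Lowercase text; the default mapping already sends U+03F4 to U+03B8."""
    return text.lower()


sample = "a\u03B8e"  # arbitrary sample string, not a claimed Rromani word
print(rromani_upper(sample))
print(rromani_lower(rromani_upper(sample)) == sample)
```

With such a tailoring the pair round-trips inside the application, but
any other process applying default casing to the same data would still
yield U+0398.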

The Greek Theta as an IPA character was incidentally discussed already in
the following thread:
Unicode Mail List Archive: gamma as a phonetic symbol.
(Sat Sep 27 2008 - 11:43:57 CDT). Retrieved June 10, 2016, from
http://www.unicode.org/mail-arch/unicode-ml/y2008-m09/0072.html

According to Mr Everson in this thread, «Theta is perhaps the
hardest to argue for» disunification:
http://www.unicode.org/mail-arch/unicode-ml/y2008-m09/0076.html

Why this should be so is not obvious to me, because the Greek capital
letter does not match the glyph expected in the Romani International
Standard Latin script subset, as described in
https://en.wikipedia.org/wiki/Romani_alphabets#International_Standard
and in more detail in
https://fr.wikipedia.org/wiki/Th%C3%AAta_latin
(so far available in French only, but one might wish to check the
picture anyway).

Consequently, AFAIK, the Greek Capital Theta Symbol is to date preferred
as the uppercase form, not the Greek Capital Theta. Using the Symbol
variant causes problems in data processing because it lacks a round-trip
casing relationship: U+03F4 lowercases to U+03B8, but U+03B8 uppercases
to U+0398. This adds to the overall problem of cross-script usage.
Using several scripts to write one language contradicts one of the design
principles of Unicode.
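
The missing round trip can be checked with the default case mappings
shipped in Python. This is only a small sketch using the standard
unicodedata module; the constant and helper names are my own.

```python
import unicodedata

CAPITAL_THETA_SYMBOL = "\u03F4"  # GREEK CAPITAL THETA SYMBOL
SMALL_THETA = "\u03B8"           # GREEK SMALL LETTER THETA
CAPITAL_THETA = "\u0398"         # GREEK CAPITAL LETTER THETA


def describe(ch):
    """Return a 'U+XXXX NAME' label for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Default (untailored) case mappings as applied by str.lower()/str.upper().
lowered = CAPITAL_THETA_SYMBOL.lower()
round_tripped = lowered.upper()

print(describe(CAPITAL_THETA_SYMBOL), "->", describe(lowered),
      "->", describe(round_tripped))
print("round trip preserved:", round_tripped == CAPITAL_THETA_SYMBOL)
```

The lowercase step lands on U+03B8 and the uppercase step comes back as
U+0398, so text capitalized with the Symbol variant does not survive a
lowercase/uppercase cycle.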

I also note that, in its International Standard Alphabet form, Romany is
not supported by the blocks up to Latin Extended-A, contrary to what
TUS 8.0 states on page 296. It is also worth underscoring that Unicode
added the H with háček (U+021E, U+021F) for Finnish Romany in the
Latin Extended-B block.

However U+03F4 ( ϴ ) GREEK CAPITAL THETA SYMBOL was among the
subset of potentially obsolete characters found in the Archives of
this List in the following e-mail:
http://www.unicode.org/mail-arch/unicode-ml/y2009-m01/0558.html

Solving this issue now is important because the French Standard
Keyboard Layout will support the Rromani Standard Latin script (along
with all European languages written in the Latin script). Since this
topic is about plain character encoding, Iʼve finally decided to submit
it for your kind advice.

Marcel
Received on Sat Jun 11 2016 - 17:20:34 CDT
