Re: Preliminary proposal to encode Unifon in the UCS. from Jean-François Colson on 2012-05-31 (Unicode Mail List Archive)

From: Jean-François Colson <jf_at_colson.eu>
Date: Thu, 31 May 2012 20:33:24 +0200

Hello

I wrote: “1st possibility: a separate script. There’ll be no problem.”
You wrote: “There would, because the bulk of the script would look just
like Latin, and the encoding committees consider this to be a security
issue for internet spoofing for instance.”
I don’t understand.
Internet spoofing would be possible for example by mixing Latin and
Cyrillic letters in internationalized domain names. For example, instead
of paypal.com, you could take advantage of the fact that the first five
letters all have lookalike Cyrillic counterparts and register one of the
31 (2⁵-1) DIFFERENT domain names paypаl.com, payрal.com, payраl.com,
paуpal.com, paуpаl.com, paурal.com, paураl.com, pаypal.com, pаypаl.com,
pаyрal.com, pаyраl.com, pауpal.com, pауpаl.com, pаурal.com, pаураl.com,
рaypal.com, рaypаl.com, рayрal.com, рayраl.com, рaуpal.com, рaуpаl.com,
рaурal.com, рaураl.com, раypal.com, раypаl.com, раyрal.com, раyраl.com,
рауpal.com, рауpаl.com, раурal.com or раураl.com to ask your “customers”
for their PayPal e-mail address and password. That could only work if the
customer is very distracted or has previously typed
“about:config” in the address bar and set network.IDN_show_punycode to
false. (That works with Firefox. The way to do it could be different
with other browsers.)
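
A minimal sketch of where the 31 names come from, assuming only the
substitutions p→р (U+0440), a→а (U+0430) and y→у (U+0443). The helper
names spoof_variants and to_ace are hypothetical, and to_ace uses
Python's raw punycode codec with an "xn--" prefix as a rough stand-in
for what a browser showing Punycode would display, not a full IDNA
implementation.

```python
from itertools import product

# Latin letters in "paypa" and their Cyrillic lookalikes (assumed mapping).
LOOKALIKES = {"p": "\u0440", "a": "\u0430", "y": "\u0443"}


def spoof_variants(label="paypal", n=5):
    """Return every mixed Latin/Cyrillic spelling of the first n letters."""
    head, tail = label[:n], label[n:]
    # Each position offers two choices: the Latin letter or its lookalike.
    choices = [(ch, LOOKALIKES[ch]) for ch in head]
    variants = []
    for combo in product(*choices):
        candidate = "".join(combo) + tail
        if candidate != label:  # drop the genuine all-Latin spelling
            variants.append(candidate)
    return variants


def to_ace(label):
    """Rough ASCII form of one label: 'xn--' + raw punycode (sketch only)."""
    return "xn--" + label.encode("punycode").decode("ascii")


if __name__ == "__main__":
    names = spoof_variants()
    print(len(names))  # 2**5 - 1 spellings
    for name in names[:3]:
        print(name + ".com", "->", to_ace(name) + ".com")
```

Once the ASCII form is shown, the spoofed names no longer resemble the
real one, which is why the trick depends on the display setting.
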
But, as far as I know, the domain names are commonly written in
lowercase. When I type in capitals a domain name which doesn’t exist,
such as CUYOPUIESVRDKRSIXTVESVRDSHKSE.com, it is automatically converted
to lowercase (http://www.cuyopuiesvrdkrsixtvesvrdshkse.com/) before the
“not found” message is displayed.
In Unifon, only the capital letters would look alike. The lowercase
letters would be different. There could be a problem with the letter o,
but that would be a drop in the ocean, not more problematic than the
letter ᴏ (small capital o), ο (Greek omicron), о (Cyrillic o), ⲟ (Coptic
o), 𐐬 (Deseret o), ჿ (Georgian labial sign), ੦ (Gurmukhi zero), all the
zeros, most of which look like circles, etc.
What exactly is the real security issue with Unifon as a separate
script? Someone who wants to spoof will find a way to do it without that.
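
To make the list of o lookalikes above concrete, here is a small sketch
that prints the code point and character name of each one. The code
points are my reading of the characters quoted above, and the names come
from whatever Unicode version the local Python ships, so an older
interpreter may not know the newest ones (hence the fallback string).

```python
import unicodedata

# Characters named above that resemble a Latin small letter o.
LOOKALIKES_OF_O = [
    "\u1d0f",      # small capital o
    "\u03bf",      # Greek omicron
    "\u043e",      # Cyrillic o
    "\u2c9f",      # Coptic o
    "\U0001042c",  # Deseret o
    "\u10ff",      # Georgian labial sign
    "\u0a66",      # Gurmukhi zero
]


def describe(chars):
    """Return (code point, character name) pairs for the given characters."""
    rows = []
    for ch in chars:
        name = unicodedata.name(ch, "<name unknown to this Python>")
        rows.append((f"U+{ord(ch):04X}", name))
    return rows


if __name__ == "__main__":
    for code_point, name in describe(LOOKALIKES_OF_O):
        print(code_point, name)
```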

NOW, a few comments about the Unifon proposal.

You didn’t correct “for several the Hupa, Yurok, Tolowa, and Karok
languages”.
There’s also the word “Karok”. Below, you write “Karuk”.

In the Unifon letters unified with existing characters, you forgot the
letter I.

You propose a Latin capital letter small capital i to be paired with ɪ
(Latin letter small capital i). Would ɪ have wider serifs when displayed
in small caps?

For the Latin capital beta, you wrote: “The unique Latin capital form
meets one of the major criteria for disunification.”
Could I use the same formula for Unifon? The unique Unifon small forms
meet one of the major criteria for disunification…

In the previous proposal, you also included a letter which looked a
little like a ƆC ligature or a rounded X. You called it zhay in n4195.
Have you forgotten it deliberately? That’s the last letter in figure 1,
although you wrote X in the caption.

You also used an X in Figure 7’s caption: it would be strange to have an
X pronounced /ʒ/ (zh) in a phonemic alphabet for English.

In the first three columns of the table at page 12, the two parts of
Latin letter oy are detached. In all samples of Unifon I’ve seen which
use that letter, the vertical line of the turned Ⱶ is tangent to the
right of the O.

In the same table, the Latin letter dhe should have a round shape.
That’s one of the two features which make it possible to distinguish it from the
Latin letter the.
In all Unifon fonts I know except one, the left part of the letter dhe
is not really a T but something midway between a T and a Γ.

I think Latin letter the should have a small top bar.

In this table of the Tolowa Unifon alphabet,
http://unifon.org/images/TOLOWA.jpg , some letters have a different
value when followed by a small stroke which looks like an apostrophe.
Should it be an ASCII apostrophe, a ’ (U+2019), a ʼ (U+02BC), a Ꞌ
(saltillo) or something else?
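
One property that may help with that choice is the general category of
each candidate, since letters and punctuation usually behave differently
in word selection and line breaking. A minimal sketch, assuming the four
candidates are U+0027, U+2019, U+02BC and U+A78B (the helper name
classify is hypothetical):

```python
import unicodedata

# Candidate characters for the small stroke (assumed code points).
CANDIDATES = {
    "ASCII apostrophe": "\u0027",
    "right single quotation mark": "\u2019",
    "modifier letter apostrophe": "\u02bc",
    "saltillo": "\ua78b",
}


def classify(candidates):
    """Map each label to (code point, Unicode general category)."""
    return {
        label: (f"U+{ord(ch):04X}", unicodedata.category(ch))
        for label, ch in candidates.items()
    }


if __name__ == "__main__":
    for label, (code_point, category) in classify(CANDIDATES).items():
        # Categories starting with "L" are letters, with "P" punctuation.
        print(f"{code_point}  {category}  {label}")
```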

On page 3, the capital ʃ looks like an enlarged form of the lowercase
letter, different from the Greek capital sigma-like Ʃ. Would the unique
Latin capital form meet one of the major criteria for disunification?
What about the capital U with a tail?

I wonder whether the 8th letter of the 42-letter “Indian Unifon
Single-Sound Alphabet” is a turned or a reversed C.

For the turned e-r, I think a new lowercase letter is needed.

For the Latin letter reversed-e e, could the double ϵ, used for the same
sound in the Initial Teaching Alphabet, be used as a lower case letter?

Would a separate proposal be required for the Initial Teaching Alphabet
(http://en.wikipedia.org/wiki/Initial_Teaching_Alphabet)?
28 or 29 letters of this 44-letter alphabet are already supported:
b, c, d, f, ɡ, h, j, k, l, m, n are already supported.
ng ligature is different from ŋ.
p, r, s are already supported.
reversed z is proposed for Unifon.
t, v, w, y, z, ʒ are already supported.
ʗh ligature
ʃh ligature
ʈh ligature
reversed t-h ligature
wh ligature
a, e, i, o, u are already supported.
omega
ɑ, æ are already supported.
au ligature
ϵϵ ligature
œ is already supported.
omega with curl
ᵫ is already supported, but the I.T.A. version is a little different.
ie ligature
oi ligature
ou ligature

JF
Received on Thu May 31 2012 - 13:35:31 CDT
