Re: "Interoperability is getting better" ... What does that mean? from Naena Guru on 2013-01-08 (Unicode Mail List Archive)

From: Naena Guru <naenaguru_at_gmail.com>
Date: Tue, 8 Jan 2013 15:56:52 -0600

Thank you for commenting and Happy New Year.

> CP-1252 is a perfectly legal web character set, and nobody is going to
> argue with you if you want to use it in legal ways. (I.e. writing
> Latin script in it, not Sinhala.) But .

Okay, the implication is that I am doing something illegal. Define what I
am doing that is illegal, cite the rule, and state its purpose: what harm
it prevents, and to whom.

May I ask if the following two are Latin script, English or Singhala?

1. This is written in English.
2. mee laþingaþa síhalayi.

For me, both are Latin script; 1 is English and 2 is Singhala (it says,
'this is romanized Singhala').

The following are the *only* Singhala language web pages that pass HTML
validation (challenge me):
http://www.lovatasinhala.com/
They are in romanized Singhala.

The statement,

> the death of most character sets makes everyone's systems smaller and
> faster

is *FALSE*. Compare the sizes of the following two files, which are copies
of a newspaper article. The top part, in red, has a few more words in the
romanized Singhala file. Notice the size of each file:
1. http://ahangama.com/jc/uniSinDemo.htm size:38,092 bytes
2. http://ahangama.com/jc/RSDemo.htm size:18,922 bytes
As the page grows, the Unicode Sinhala version tends toward double the
size of its romanized Singhala version. Unicode Sinhala characters become
50% larger when UTF-8 encoded for transmission (three bytes per character
instead of two). That is three times the size of the romanized Singhala
file. So, the Unicode Sinhala file consumes 3 times the bandwidth needed
to send the romanized Singhala file.
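
The per-character arithmetic above can be checked with a short Python
sketch. The sample strings and the function name encoded_sizes are only
illustrative assumptions; they are not taken from the two demo files, and
the two samples are not the same text. The sketch therefore shows bytes per
character only. The ratio for a whole page also depends on the markup and
on how many Latin letters the romanization needs for each Sinhala letter.

```python
def encoded_sizes(text, encodings):
    """Return the encoded length in bytes of text for each named encoding."""
    return {name: len(text.encode(name)) for name in encodings}

# Hypothetical sample: five code points from the Sinhala block (U+0D80..U+0DFF).
sinhala_sample = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"

# The romanized sentence quoted earlier in this message
# (\u00fe is the letter thorn, \u00ed is i with acute accent).
romanized_sample = "mee la\u00feinga\u00fea s\u00edhalayi."

sinhala_sizes = encoded_sizes(sinhala_sample, ("utf-16-le", "utf-8"))
romanized_sizes = encoded_sizes(romanized_sample, ("cp1252", "utf-8"))

print(len(sinhala_sample), "Sinhala code points:", sinhala_sizes)
print(len(romanized_sample), "romanized characters:", romanized_sizes)
```

Here each Sinhala code point takes 2 bytes in UTF-16 and 3 bytes in UTF-8,
while each romanized character takes 1 byte in CP-1252.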

> more likely to correctly show them the document instead of trash

Again *demonstrably WRONG*: Unicode Sinhala is trash on a machine that does
not have the fonts. It is also trash if the font used by the OS is
improperly made, as on the iPhone. It is generally trash because the
SLS1134 standard corrupts at least one writing convention (the Brandy
issue). On the other hand, romanized Singhala is always readable whether
you have the font or not. It is not helpful to criticize Singhala-related
things without making a serious effort to understand the issues. Blind men
thought different things about the elephant.

If you mean that everyone should start using 16-bit Unicode characters, I
have no objection to that. It would happen if and when all applications
implement it. I could not fight that even if I wanted to. But I do not see
users of English doing anything different from what they are doing now,
which, like my typing now, I think, uses 8-bit characters. (I can verify
that by copying it and pasting it into a text editor.)
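
A check of that kind can also be sketched in Python rather than in a text
editor. The function name fits_single_byte and the sample strings are
assumptions made for illustration. The sketch only tests whether every
character has a one-byte form in CP-1252; what an editor actually writes to
disk depends on the encoding it saves with.

```python
def fits_single_byte(text, encoding="cp1252"):
    """Return True if every character of text encodes to one byte in encoding."""
    try:
        return len(text.encode(encoding)) == len(text)
    except UnicodeEncodeError:
        # At least one character has no representation in this encoding.
        return False

english = "This is written in English."
romanized = "mee la\u00feinga\u00fea s\u00edhalayi."
sinhala = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"  # hypothetical Sinhala-block sample

for label, sample in (("english", english),
                      ("romanized", romanized),
                      ("sinhala", sinhala)):
    print(label, fits_single_byte(sample), len(sample.encode("utf-8")))
```

Plain ASCII text such as the English sentence also stays at one byte per
character in UTF-8, so for English the choice of encoding does not change
the size.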

I showed that Singhala can be romanized, and that all the problems of
ill-conceived Unicode Indic can be eliminated by carefully studying the
grammar of the language and romanizing it. (I used the word
'transliterate' earlier, but the correct word is 'transcribe'.) I did it
for Singhala and made an OpenType font to show it perfectly in the
traditional Singhala script. So far there are one RS smartfont and six
Unicode fonts, and that even after spending $20M on a foreign expert to
explain how to make fonts, though the information is right there on the
web in the same language the expert spoke.

My work irritates some, maybe because it is an affront to their belief
that they know all and decide all. Some feel let down that they could not
think of it earlier, and may instead write about a strange discovery like
Abugida and write a book on the nonsense. Most of all, I think it is just
a cultural block on this side of the globe.

As for Lankan technocrats, their worry is that the purpose of ICTA would
come unraveled. I went there in November and it was revealed to me (by one
of its employees) that its purpose is to provide a single point of contact
for foreign vendors that can use local experts as their advocates.

On Thu, Jan 3, 2013 at 12:56 AM, Leif Halvard Silli <
xn--mlform-iua_at_xn--mlform-iua.no> wrote:

> Asmus Freytag, Mon, 31 Dec 2012 06:44:44 -0800:
> > On 12/31/2012 3:27 AM, Leif Halvard Silli wrote:
> >> Asmus Freytag, Sun, 30 Dec 2012 17:05:56 -0800:
> >> The Web archive for this very list, needs a fix as well …
> >
> >
> > The way to formally request any action by the Unicode Consortium is
> > via the contact form (found on the home page).
>
> Good idea. Done!
>
> Turned out to only be - it seems to me - an issue of mislabeling the
> monthly index pages as ISO-8859-1 instead of UTF-8. Whereas the very
> messages themselves are archived correctly. And thus I made the request
> that they properly label the index pages.
>
> Happy new year!
> --
> leif h silli
>
>
>
Received on Tue Jan 08 2013 - 16:00:22 CST
