Re: Romanized Singhala - Think about it again

From: Naena Guru <>
Date: Thu, 5 Jul 2012 03:02:30 -0500

On Wed, Jul 4, 2012 at 11:33 PM, Philippe Verdy <> wrote:

> Anyway, consider the solutions already proposed in Sinhalese
> Wikipedia. There are various solutions proposed, including several
> input methods supported there. But the purpose of these solutions is
> always to generate Sinhalese texts perfectly encoded with Unicode and
> nothing else.
Thank you for the kind suggestion. The problem is that Unicode Sinhala does not
perfectly support Singhala! *The solution is for Sinhala, not for Unicode!* I
am not saying Unicode has bad intentions, only that it is an ill-conceived
product. The fault is with the Lankan technocrats who took the proposal as it
was given and have ever since prevented public participation. My solution is
'perfectly encoded with Unicode'.

> Yes, there may remain some issues with older OSes that have limited
> support for standard OpenType layout tables. But there's now no
> problem at all since Windows XP SP2. Windows 7 has the full support,
> and for those users that have still not upgraded from Windows XP,
> Windows 8 will be ready in next August with an upgrade cost of about
> US$ 40 in US (valid offer currently advertized for all users upgrading
> from XP or later), and certainly even less for users in India and Sri
> Lanka.
The above are not any of my complaints.
Per capita income in Sri Lanka is $2400. They are content with cell phones.
The practical place for computers is the Internet Cafe. Linux is what the
vast majority needs.

> And standard Unicode fonts with free licences are already available
> for all systems (not just Linux for which they were initially
> developed);

Yes, only 4 rickety ones. Who is going to buy them anyway? Still Iskoola
Pota made by Microsoft by copying a printed font is the best. You check the
Plain Text by mixing Singhala and Latin in the Arial Unicode MS font to see
how pretty Plain text looks. They spent $2 or 20 million for someone to
come and teach them how to make fonts. (Search Staying friendly
with them is profitable. World bank backs you up too.
Sometime in the 1990s when I was in Lanka, I tried to select a PC for my
brother, a printer. We wanted to buy Adobe, Quark Express etc. The store
keeper gave a list and asked us to select the programs. Knowing that they
are expensive, I asked him first to tell me how much they cost. He said
that he would install anything we wanted for free! On the same trip, coming
back, in Zurich, the guys tried to give me an illicit copy of Windows OS in
appreciation for installing German and Italian (or French?) code pages on
their computers.

> there even exist solutions for older versions of iPhone
> 4. OR on Android smartphones and tablets.
Mine works in them with no special solution. It works anywhere that
supports Open Type -- no platform discrimination.

> No one wants to get back to the situation that existed in the 1980's
> when there was a proliferation of non-interoperable 8 bit encodings
> for each specific platform.
I agree. Today, 14 languages, including English, French, German and Italian
all share the same character space called ISO-8859-1. Romanized Singhala
uses the same. So, what's the fuss about? The font? Consider that as the
oft suggested IME. Haha!

> And your solution also does not work in multilingual contexts;

If mine does not work in some multilingual context, then none of the 14
languages I mentioned above, including English and French, work either.

> it does
> not work with many protocols or i18n libraries for applications.

i18n is for multi-byte characters. Mine are single-byte characters. As you
see, the safest place is SBCS.
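
A minimal Python sketch of the byte-count difference being argued here. The
sample strings are illustrative assumptions, not the orthography used on the
web site. It compares a Latin-only string encoded as ISO-8859-1 with a string
from the Unicode Sinhala block encoded as UTF-8.

```python
# Sample strings are assumptions for illustration only.
romanized = "sinhala"  # Latin letters, all inside ISO-8859-1
unicode_sinhala = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"  # five code points in U+0D80..U+0DFF

latin1_bytes = romanized.encode("iso-8859-1")
utf8_bytes = unicode_sinhala.encode("utf-8")

# ISO-8859-1 always uses exactly one byte per character.
print(len(romanized), len(latin1_bytes))        # 7 7
# Code points in U+0800..U+FFFF take three bytes each in UTF-8.
print(len(unicode_sinhala), len(utf8_bytes))    # 5 15
```

Note that this only measures storage size. It says nothing about whether a
given application or protocol can tell what script the bytes are meant to show.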

> Or it
> requires specific constraints on web pages requiring complex styling
> everywhere to switch fonts.

Did you see May be you are confusing Unicode
Sinhala and romanized Singhala. Unicode Sinhala has a myriad such problems.
That is why it should be abandoned! Please look at the web site and say it
more coherently, if I misunderstood you.

> Plain text searches in multilingual pages
> won't work. Usability tools won't work.
Have you tried to search a vowel in Unicode Sinhala? Romanized Singhala has
no search problem. Try it on my web site.
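
A small Python sketch of the vowel-search issue, under stated assumptions: the
romanized form "ki" below is a hypothetical spelling chosen for illustration,
not necessarily the scheme used on the web site. In Unicode Sinhala the
independent vowel and the dependent vowel sign are separate code points, so a
plain substring search for one does not find the other.

```python
INDEPENDENT_I = "\u0d89"  # independent vowel letter "i"
KA = "\u0d9a"             # consonant letter "ka"
SIGN_I = "\u0dd2"         # dependent vowel sign "i", attached to a consonant

unicode_syllable = KA + SIGN_I   # the syllable "ki" in Unicode Sinhala
romanized_syllable = "ki"        # hypothetical romanization (assumption)

# Searching for the vowel letter misses the vowel sign: different code points.
print(INDEPENDENT_I in unicode_syllable)   # False
# In a romanized string the vowel is the same character wherever it occurs.
print("i" in romanized_syllable)           # True
```

A search tool could work around this by mapping vowel letters to their signs
before matching, but a plain-text search does not do that on its own.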

> Really consider abandoning the hacked encoding of the Sinhalese
> script itself.

There is no re-encoding of Singhala. Singhala is transcribed into Latin!
When I say Singhala, I don't mean Unicode Sinhala. It is the Singhala
phoneme inventory that was transliterated.

> It will however be more valuable if you just
> concentrate on creating a simpler romanization system that will use
> standard Unicode encoding of Latin

This is exactly what I did. Have I been talking to someone who did not know
what he was evaluating?

> (note that you are absolutely not
> limited to the reduced ISO 8859-1 subset for Latin and that there's
> already a much richer set of letters, symbols and diacritics for all
> needs ; but here again this requires using Unicode and not just ISO
> 8859-1).

Oh, thank you for the generosity of allowing me use of the entire Latin
repertoire. You don't have to tell that to me. I have traveled quite a bit
in the IT world. Don't be surprised if it is more than what you've seen.
(Did you forget that earlier you accused me of using characters outside
ISO-8859-1 while claiming I am within it? That is because you saw IAST and
PTS displayed. They use those wonderful letters, symbols and diacritics you
are trying to tout.) Is there a problem with Asians using ISO-8859-1 code
space even for transliteration?

> The bonus will be that you can still write the Sinhalese
> language with a romanisation like yours,


> but there's no need to
> reinvent the Sinhalese script

The Singhala script existed many, many years before the English and
French adopted Latin. What I did was save it from the massacre going on
with Unicode Sinhala.

> itself that your encoding is not even
> capable of completely supporting in all its aspects (your system only
> supports a reduced subset of the script).
What is the basis for this nonsense? (Little birds whispering in the
background. Watch out. They are laughing.)
My solution supports the entire script, Singhala, Pali and Sanskrit plus
two rare allophones of Sanskrit as well. Tell me what it lacks and I will
add it, haha! One time you said I assigned Unicode Sinhala characters to
the 'hack' font. What I do is assign Latin characters to Singhala
phonemes. That is called transliteration. There are no 'contextual
versions' of the same Singhala letters like you said earlier.

Ask your friends what they have more than mine in the Singhala script. Ask
them why they included only two ligatures when there are 15 such. Ask them
how many Singhala letters there are.

> Even the legacy ISCII system (used in India) is better, because it is
> supported by a published open standard, for which there's a clear and
> stable conversion from/to Unicode.
My solution is supported by two standards: ISO-8859-1 and Open Type.
ISO-8859-1 corresponds to the Basic Latin and Latin-1 Supplement blocks of
the Unicode standard.
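
The relationship between ISO-8859-1 and the first 256 Unicode code points can
be checked with a short Python sketch. This is an illustration only; it uses
nothing beyond Python's built-in "iso-8859-1" codec.

```python
# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# Decoding as ISO-8859-1 maps byte N to code point U+00NN.
decoded = all_bytes.decode("iso-8859-1")
code_points = [ord(ch) for ch in decoded]

# Encoding again gives back the original bytes unchanged.
round_trip = decoded.encode("iso-8859-1")

print(code_points == list(range(256)))   # True
print(round_trip == all_bytes)           # True
```

The same text stored as UTF-8 would differ for bytes above 0x7F, where each
character takes two bytes instead of one.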

Bottom line is this: If Latin-1 is good enough for English and French, it
is good enough for Singhala too. And if Open Type is good for English and
French, it is good for Singhala too.

> 2012/7/5 Naena Guru <>:
> > Philippe,
> >
> > My last message was partial. It went out by mistake. I'll try again. It
> > takes very long for this old man.
> >
> >
> > ---------- Forwarded message ----------
> > From: Naena Guru <>
> > Date: Wed, Jul 4, 2012 at 10:32 PM
> > Subject: Re: Romanized Singhala - Think about it again
> > To:
> >
> >
> > Hi, Philippe. Thanks for keeping engaged in the discussion. Too little time
> > spent could lead to misunderstanding.
> >
> >
> > On Wed, Jul 4, 2012 at 3:42 PM, Philippe Verdy <> wrote:
> >>
> >> 2012/7/4 Naena Guru <>:
> >> > Philippe Verdy, obviously has spent a lot of time
> >>
> >> Not a lot of time... Sorry.
> >>
> >> > researching the web site
> >> > and even went as far as to check the faults of the web service provider,
> >> >
> >>
> >> I did not even note that your hosting provider was that company. I
> >> just looked at the HTTP headers to look at the MIME type and charset
> >> declarations. Nothing else.
> >
> > I know that the browser tells it. It is not a big deal, WOFF is the
> > compressed TTF, but TTF gets delivered. If and when GoDaddy fixes their
> > problem, the pages get delivered faster. Or I can make that fix in a
> > .htaccess file. No time!
> >>
> >>
> >> > He called my font a hack font without any proof of it.
> >>
> >> It is really a hack. Your font assigns Sinhalese characters to Latin
> >> letters (or some punctuations) of ISO 8859-1.
> >
> > My font does not have anything to do with Singhalese characters if you mean
> > Unicode characters. You are very confusing.
> > A Character in this context is a datatype. In the 80s it was one byte in
> > size and used to signal not to use in arithmetic. (We still did it to
> > convert between Capitals and Simple forms.) In the Unicode character
> > database, a character is a numerical position. A Unicode Sinhala character
> > is defined in Hex [0D80 - 0DFF]. Unicode Sinhala characters represent an
> > incomplete hotchpotch of ideas of letters, ligatures and signs. I have none
> > of that in the font.
> >
> > I say and know that Unicode Sinhala is a failure. It inhibits use of
> > Singhala on the computer and the network. I do not concern myself with
> > fixing it because it cannot be fixed. The only thing I did in relation to
> > it is to write an elaborate set of routines to *translate* (not map)
> > between constructs of Unicode Sinhala characters and romanized Singhala.
> > That is not in the font. The font has lookup tables.
> >
> >> It also assigns
> >> contextual variants of the same abstract Sinhalese letters, to ISO
> >> 8859-1 codes,
> >
> > What contexts cause what variants? Looks like you are saying Singhala
> > letters cha
> >>
> >> plus glyphs for some ligatures of multiple Sinhalese
> >> letters to ISO 8859-1 codes, plus it reorders these glyphs so that
> >> they no longer match the Sinhalese logical order.
> >
> >
> > [assigns] ligatures of multiple Sinhalese letters to ISO 8859-1 codes
> > What is Singhalese logical order?
> >
> >>
> >>
> >> Yes this font is a hack because it pretends to be ISO 8859-1 when it
> >> is not. It is a specific distinct encoding which is neither ISO 8859-1
> >> nor Unicode, but something that exists in NO existing
> >> standard.
> >>
> >> > It has
> >> > only characters relevant to romanized Singhala within the SBCS. Most of the
> >> > work was in the PUA and Look-up Tables. I am reminded of Inspector Clouseau
> >> > that has many gadgets and in the end finds himself as the culprit.
> >>
> >> And you have invented an Inspector Guru gadget for your private use on
> >> your site, instead of developing a TRUE separate encoding that you
> >> SHOULD NOT name "ISO 8859-1". Try to do that, but be aware that the
> >> ISO registry of 8-bit encodings is now frozen. You'll have to convince
> >> the IANA registry to register your new encoding. For now it is
> >> registered nowhere. This is a purely local creation for your site.
> >>
> >> > I will still read and try those other things Philippe suggests, when I get
> >> > time. What is important for me is to improve on orthography rules and add
> >> > more Indic languages -- Devanagari and Tamil coming up.
> >> >
> >> > As for those who do not want to think rationally and think Unicode is a
> >> > religion,
> >>
> >> No. Unicode is a technical solution for a long-standing problem:
> >> interoperability of standards using open technologies. Given that you
> >> do not want to even develop your own encoding as a registered open
> >> standard compatible with a lot of applications (remember that all new
> >> web standards MUST now support Unicode in at least one of its standard
> >> UTFs), you're just losing time here.
> >>
> >> > I can only point to my dilemma:
> >> >
> >> >
> >> > Have a Happy Fourth of July!
> >>
> >> Next time don't cite me personally trying to convince others that I
> >> have supported or said something I did not write myself. You have
> >> interpreted my words at your convenience, but I don't want to be
> >> associated nominatively and publicly with your personal
> >> interpretations. Even if I also have my own opinions, I don't want to
> >> cite anyone else's opinions without just quoting his own sentences
> >> (provided that these sentences were public or that I was authorized by
> >> him to quote his sentences in other contexts).
> >>
> >> Stop this abuse of personalities. Thanks.
> >
> >
> >
Received on Thu Jul 05 2012 - 03:05:45 CDT