Re: Romanized Singhala - Think about it again

From: Jean-François Colson
Date: Thu, 05 Jul 2012 14:35:04 +0200

On 05/07/12 10:02, Naena Guru wrote:
> On Wed, Jul 4, 2012 at 11:33 PM, Philippe Verdy wrote:
> Anyway, consider the solutions already proposed in Sinhalese
> Wikipedia. There are various solutions proposed, including several
> input methods supported there. But the purpose of these solutions is
> always to generate Sinhalese texts perfectly encoded with Unicode and
> nothing else.
> Thank you for the kind suggestion. The problem is Unicode Sinhala does
> not perfectly support Singhala!
What's wrong? Are there missing letters?

> *The solution is for Sinhala not for Unicode!*
Or rather for Sinhala by Unicode.

> **I am not saying Unicode has a bad intention but an ill-conceived
> product.
What precisely is ill-conceived?

> The fault is with Lankan technocrats that took the proposal as it was
> given and ever since prevented public participation. My solution is
> 'perfectly encoded with Unicode'.
No. It's an 8-bit character set independent from Unicode.

> Yes there may remain some issues with older OSes that have limited
> support for standard OpenType layout tables. But there's now no
> problem at all since Windows XP SP2. Windows 7 has the full support,
> and for those users that have still not upgraded from Windows XP,
> Windows 8 will be ready in next August with an upgrade cost of about
> US$ 40 in US (valid offer currently advertized for all users upgrading
> from XP or later), and certainly even less for users in India and Sri
> Lanka.
> The above are not any of my complaints.
> Per Capita Income in Sri Lanka $2400. They are content with cell
> phones. The practical place for computers is the Internet Cafe. Linux
> is what the vast majority needs.
> And standard Unicode fonts with free licences are already available
> for all systems (not just Linux for which they were initially
> developed);
> Yes, only 4 rickety ones. Who is going to buy them anyway?
Why would you buy them if they're free?

> Still Iskoola Pota made by Microsoft by copying a printed font is the
> best. You check the Plain Text by mixing Singhala and Latin in the
> Arial Unicode MS font to see how pretty Plain text looks. They spent
> $2 or 20 million for someone to come and teach them how to make fonts.
> (Search Staying friendly with them is profitable. World bank
> backs you up too.
> Sometime in 1990s when I was in Lanka, I tried to select a PC for my
> printer brother. We wanted to buy Adobe, Quark Express etc. The store
> keeper gave a list and asked us to select the programs. Knowing that
> they are expensive, I asked him first to tell me how much they cost.
> He said that he will install anything we wanted for free! The same
> trip coming back, in Zurich, the guys tried to give me a illicit copy
> of Windows OS in appreciation for installing German and Italian (or
> French?) code pages on their computers.
> there even exists solutions for older versions of iPhone
> 4. OR on Android smartphones and tablets.
> Mine works in them with no special solution. It works anywhere that
> supports Open Type -- no platform discrimination
Is there any platform discrimination with Unicode Sinhala?

> No one wants to get back to the situation that existed in the 1980's
> when there was a proliferation of non-interoperable 8 bit encodings
> for each specific platform.
> I agree. Today, 14 languages, including English, French, German and
> Italian all share the same character space called ISO-8859-1.
In fact, ISO-8859-1 is not well suited for French (my native language):
it lacks a few letters which were added to ISO-8859-15. However, I
always use Unicode today, even for French-only texts.
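
A minimal sketch of the point above, assuming Python's built-in codecs. The three letters tested (œ, Œ, Ÿ) are the French letters that ISO-8859-1 lacks and ISO-8859-15 adds; the helper name `encodable` is hypothetical and only for illustration.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# U+0153 (oe ligature), U+0152 (capital OE ligature), U+0178 (capital Y with diaeresis)
FRENCH_EXTRAS = ["\u0153", "\u0152", "\u0178"]

# Letters that cannot be stored in each 8-bit character set
latin1_missing = [ch for ch in FRENCH_EXTRAS if not encodable(ch, "iso-8859-1")]
latin9_missing = [ch for ch in FRENCH_EXTRAS if not encodable(ch, "iso-8859-15")]

print("missing from ISO-8859-1: ", latin1_missing)
print("missing from ISO-8859-15:", latin9_missing)
```

All three letters encode without trouble in UTF-8, which is one practical reason to use Unicode even for French-only text.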

> Romanized Singhala uses the same. So, what's the fuss about? The font?
The problem is that only your transliteration scheme, with Latin
letters, is supported by ISO-8859-1, not the Sinhalese letters themselves.
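
The distinction can be checked directly. The following is only a sketch, assuming Python's standard `unicodedata` module; the romanized spelling `sinhala` is a placeholder chosen for illustration and is not taken from the transliteration scheme under discussion.

```python
import unicodedata

# The word "Sinhala" written in Sinhala script (code points in U+0D80..U+0DFF)
SINHALA_WORD = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"
# Placeholder Latin-letter spelling, for illustration only
ROMANIZED = "sinhala"

def fits_latin1(text: str) -> bool:
    """Return True if the text can be stored as ISO-8859-1 bytes."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

def scripts(text: str) -> set:
    """Collect the first word of each character name, e.g. LATIN or SINHALA."""
    return {unicodedata.name(ch).split()[0] for ch in text}

for label, text in (("Sinhala script", SINHALA_WORD), ("romanized", ROMANIZED)):
    print(label, scripts(text), "fits ISO-8859-1:", fits_latin1(text))
```

Whatever glyphs a font draws for them, the characters stored in the romanized text remain Latin letters.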

> Consider that as the oft suggested IME. Haha!
> And your solution also does not work in multilingual contexts;
> If mine does not work in some multilingual context, none of the 14
> languages I mentioned above including English and French don't either.
They do because they use Latin letters, not Sinhalese letters.

> it does
> not work with many protocols or i18n libraries for applications.
> i18n is for multi-byte characters. Mine are single-byte characters.
OK. Do it as you want, but it won't be Unicode compliant.

> As you see, the safest place is SBCS.
I don't see. Why is it safer?

> Or it
> requires specific constraints on web pages requiring complex styling
> everywhere to switch fonts.
> Did you see
> <>? Maybe you are confusing Unicode
> Sinhala and romanized Singhala. Unicode Sinhala has a myriad such
> problems.
Which problems?

> That is why it should be abandoned!
Why wouldn't you try to solve the problems, whatever they could be,
instead of proposing an entirely new character set nobody will support?
If the rendering engines don't work as you expect they should, how could
a new encoding scheme solve the problem?

> Please look at the web site and say it more coherently, if I
> misunderstood you.
> Plain text searches in multilingual pages
> won't work. Usability tools won't work.
> Have you tried to search a vowel in Unicode Sinhala? Romanized
> Singhala has no search problem. Try it in the my web site.
Well, perhaps there're problems with search engines. Wouldn't it be
possible to correct search engines instead of inventing a new character set?

> Really consider abandoning the hacked encoding of the Sinhalese
> script itself.
> There is no re-encoding of Singhala. Singhala is transcribed into
> Latin! When I say Singhala, I don't mean Unicode Sinhala. It is the
> Singhala phoneme inventory that was transliterated.
Using Latin letters for a transliteration of Sinhala is not a hack, but
making fonts said to be Latin-1 with Sinhalese letters instead of the
Latin letters is a hack.

> It will however be more valuable if you just
> concentrate on creating a simpler romanization system. that will use
> standard Unicode encoding of Latin
> This is exactly what I did. Have I been talking to someone who did not
> know what he was evaluating?
I think he was speaking of the transliteration, not of your hack.

> (note that you are absolutely not
> limited to the reduced ISO 8859-1 subset for Latin and that there's
> already a much richer set of letters, symbols and diacritics for all
> needs ; but here again this requires using Unicode and not just ISO
> 8859-1).
> Oh, thank you for the generosity of allowing me use of the entire
> Latin repertoire. You don't have to tell that to me. I have traveled
> quite a bit in the IT world. Don't be surprised if it is more than
> what you've seen. (Did you forget that earlier you accused me of using
> characters outside ISO-8859-1 while claiming I am within it? That
> is because you saw IAST and PTS displayed. They use those wonderful
> letters symbols and diacritics you are trying to tout. Is there a
> problem with Asians using ISO-8859-1 code space even for transliteration?
> The bonus will be that you can still write the Sinhalese
> language with a romanisation like yours,
> Bonus?
> but there's no need to
> reinvent the Sinhalese script
> Singhala script existed many, many years since before the English and
> French adopted Latin.
Did anybody say it didn't?

> What I did was saving it from the massacre going on with Unicode Sinhala.
Which massacre? What's wrong with the Unicode support of Sinhala? Could
you give details, please?

> itself that your encoding is not even
> capable of completely supporting in all its aspects (your system only
> supports a reduced subset of the script).
> What is the basis for this nonsense?. (Little birds whispering in the
> background. Watch out. They are laughing).
> My solution supports the entire script, Singhala, Pali and Sanskrit
> plus two rare allophones of Sanskrit as well. Tell me what it lacks
> and I will add it, haha! One time you said I assigned Unicode Sinhala
> characters to the 'hack' font. What I do is assigning Latin characters
> to Singhala phonemes. That is called transliteration. There are no
> 'contextual versions' of the same Singhala letters like you said earlier.
> Ask your friends what they have more than mine in the Singhala script.
> Ask them why they included only two ligatures when there are 15 such.
Can't you make a proposal or describe the missing letters?

> Ask them how many Singhala letters there are.
> Even the legacy ISCII system (used in India) is better, because it is
> supported by a published open standard, for which there's a clear and
> stable conversion from/to Unicode.
> My solution is supported by two standards: ISO-8859-1 and Open Type.
> ISO-8859-1 is Basic Latin plus Latin-1 Extension part of Unicode standard.
It is not supported by ISO-8859-1. ISO-8859-1 is for Latin letters, not
Sinhalese ones.
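
As a rough check of this, assuming Python's `iso-8859-1` codec (which also maps the C0/C1 control positions), every one of the 256 byte values decodes to the Unicode code point with the same number, and none of them lands in the Sinhala block U+0D80..U+0DFF:

```python
# Decode every possible byte value as ISO-8859-1
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-1")
code_points = [ord(ch) for ch in decoded]

# The Unicode Sinhala block
SINHALA_BLOCK = range(0x0D80, 0x0E00)
in_sinhala_block = [cp for cp in code_points if cp in SINHALA_BLOCK]

print("highest code point reachable:", hex(max(code_points)))
print("code points in the Sinhala block:", in_sinhala_block)
```

So a file that is valid ISO-8859-1 can only ever carry characters from U+0000 to U+00FF, however a font chooses to draw them.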

> Bottom line is this: If Latin-1 is good enough for English and French,
> it is good enough for Singhala too.
No, because Sinhala is not written with Latin letters.

> And if Open Type is good for English and French, it is good for
> Singhala too.
Of course.
Received on Thu Jul 05 2012 - 07:38:09 CDT
