Re: Romanized Singhala - Think about it again from Jean-François Colson on 2012-07-17 (Unicode Mail List Archive)

From: Jean-François Colson <jf_at_colson.eu>
Date: Tue, 17 Jul 2012 20:11:55 +0200

On 17/07/12 02:43, Naena Guru wrote:
>> Just see the daily questions and dedicated section for Indic at
Unicode.org, and think why ordinary people Anglicize instead of using
Unicode Sinhala. (e.g. elakiri.com).
> Some also use the Sinhalese script.
> I’ve sometimes seen people type in Arabic with Latin letters in a
French Library, because the computers they used only had French
keyboards and they didn’t know an Arabic keyboard enough to touch type
in Arabic with Arabic letters.
>
> That's right. Everyone is familiar with the good old QWERTY keyboard.
The Singhalese have developed their own Anglicizing convention. The
Tamils do it too, but their Anglicizing is different from the one the
Singhalese use. They are little more respectful of their language and
try to Anglicize more precisely.
>
> I used the Singhala typewriter in late 60s. The gayanna was where you
get period on QWERTY. It is entirely different from the layout of the
English one with dead keys for parts of letters. This is what Unicode
Sinhala inherited. It is many fold easier if Singhala follows closely
with English layout. I made one for Unicode. The best I could get still
needed three-finger keys. Besides, even after you enter ZWJ and do not
get the desired conjoints because the font does not have them.
Typing and encoding are two different matters.
If present Sinhalese fonts don’t do the job, you can improve them.

You can develop a hundred keyboard layouts and input methods to type the
same text in a hundred different ways.

Aren’t there any keyboards with the Sinhalese letters drawn on the keytops?

If there aren’t and you think the present Sinhalese keyboard layout
doesn’t fit the QWERTY layout well enough, feel free to design a new
layout and distribute drivers for the main operating systems.

>
>
>>
>> It's a colossal failure!
> Really?
>
> Of course, I don't have to repeat. You have read what I said.
I have.

>
>
>> The people Anglicize than using Unicode Sinhala.
> What do you mean? If they transliterate, that’s not really
anglicization.
>
> You get a glimpse of the light. Anglicizing is trying to use English
writing conventions to write Singhala. Anglicizing is not a complete
mapping, transliteration is. Singhala has 58 phonemes including 10
digraphs used for Sanskrit and Pali (aspirates). The English alphabet is
not enough even for English. It has discarded þorn, eð, æsc etc. So, it
has digraphs. Then because of the capitalizing convention it makes its
set of letters even fewer.
>
>
>
>> To be fair, the Lankan technocrats did not have a clue when they
were asked to approve the standard.
> I know that problem. The same occurred for French with Latin-1.
That’s why some French letters are missing in Latin-1.
>
> Tell me about it.
Latin-1 (ISO-8859-1) lacks the French letter Œ/œ and the capital Ÿ.
Œ is used in a number of common words such as cœur (heart), œil (eye),
œsophage (oesophagus), Œdipe (Oedipus), œuf (egg), etc.
Ÿ is used in a few toponyms such as L’Haÿ-les-Roses, a commune near
Paris, which can be capitalized as L’HAŸ-LES-ROSES.
It also lacks the apostrophe ’.

Œ, œ and Ÿ were added in Latin-9 (ISO-8859-15):
Œ = 0xBC, œ = 0xBD, Ÿ = 0xBE
and all four characters are in CP1252:
Œ = 0x8C, œ = 0x9C, Ÿ = 0x9F, ’ = 0x92

Of course, AFAICT, they were part of the first release of Unicode.
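
As a quick check of the byte values above, here is a minimal Python sketch using the standard codecs `latin-1`, `iso8859_15` and `cp1252`. The helper name `encode_or_none` is invented for this illustration only.

```python
# Compare how three single-byte encodings handle the characters listed above.
CHARS = ["Œ", "œ", "Ÿ", "\u2019"]  # U+2019 is the typographic apostrophe


def encode_or_none(ch, codec):
    """Return the single byte value of ch in codec, or None if it is absent."""
    try:
        return ch.encode(codec)[0]
    except UnicodeEncodeError:
        return None


# One row per encoding, one column per character in CHARS.
table = {
    codec: [encode_or_none(ch, codec) for ch in CHARS]
    for codec in ("latin-1", "iso8859_15", "cp1252")
}

for codec, values in table.items():
    shown = ["--" if v is None else f"0x{v:02X}" for v in values]
    print(f"{codec:11}", *shown)
```

A `--` in the printed row means the character cannot be encoded in that character set at all.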

> It is first come, first serve.
It is.

> Isn't language, and therefore, the writing a (if not the) major part
of a culture?
You’re right.

>
>
>> It was a time when there was (perhaps even now) a typist in the
corner of the office of the bureaucrat. The big guys do not know
touch-typing even now. Proof: A university professor wrote me a harangue
using cyber-sex orthography (no capitals) accusing me for working for
Americans. I had suggested that Unicode is a conspiracy to confuse us.
(That is a bit way over, no such motive, nevertheless the effect is the
same)
>>
>>
>>
>>> Romanized Singhala uses the same. So, what's the fuss
about? The font?
>>
> The fact that your encoding won’t be supported on many computers
worldwide.
>
> Jean, for the umpteenth time, I am not encoding anything. It is a
transliteration. It is using a different script (Latin) than what you
use traditionally (Singhala):
> Not සිංහල අකුරු, but 'síhala akuru'.
OK. Do you display Sinhala with Latin letters?
If you do, that’s not a problem.
If you display it with Sinhalese letters, you’ll need to change the font
whenever you want to write in another script.
Just imagine a Sinhala/English dictionary. How many font changes would
you make for such a book?
That’s a big step backwards.

> අ -> a
> එ -> e
> ක් -> k
> අං -> á
> ඤ් -> ç
> ශ් -> z
> etc. . .
>
> About the font that unnerves you:
> Think of 3D cinema. If you wear the 3D glasses, you see clearly. The
font is for the user's benefit. The web masters can give the option I
gave on my site to keep happy those who dislike (warning: I must select
mild adjectives to honor sensitivities of some) seeing the Singhala
letters. I have the feeling that you did not see the web site:
> http://www.lovatasinhala.com
> Hit the link on RHS that says Latin Script
I did see it. It uses an 8-bit embedded font to display Sinhala. But if
I copy and paste some part of the text into my favourite word processor, I
see Latin letters instead. That’s the problem.
Also, a speech synthesis program or a braille converter would need a
transcoding table to understand your texts.
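
To illustrate what such a transcoding table could look like, here is a deliberately tiny Python sketch built only from the six pairs quoted in this thread (a, e, k, á, ç, z). The names `ROMAN_TO_SINHALA` and `transcode` are made up for this example. It is a naive character-by-character substitution: a real converter would also have to handle vowel signs, conjuncts and the rest of the scheme, none of which is attempted here.

```python
import unicodedata

# Toy transcoding table: romanized text -> Unicode Sinhala.
# Only the six pairs quoted in this thread are included (assumption: the
# rest of the scheme would be filled in the same way).
ROMAN_TO_SINHALA = {
    "a": "අ",
    "e": "එ",
    "k": "ක්",
    "á": "අං",
    "ç": "ඤ්",
    "z": "ශ්",
}


def transcode(text):
    """Replace each known romanized character; leave everything else as-is."""
    # Normalize first so that 'a' + combining acute matches the precomposed 'á'.
    text = unicodedata.normalize("NFC", text)
    return "".join(ROMAN_TO_SINHALA.get(ch, ch) for ch in text)


print(transcode("k"))
print(transcode("ae"))
```

Any program that has to treat the text as Sinhala (search, speech synthesis, braille) would need a table of this kind before it could start.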

>
>
>> The problem is that only your transliteration scheme, with
Latin letters, is supported by ISO-8859-1, not the Sinhalese letters
themselves.
>>
>> You are right partially. I do not need permission from anyone to
use any font.
>>
>> Jean, the computer thinks it is ISO-8859-1. ISO 8859-1 is only a
set of numbers!
> No. It also is a table which makes relations between those
numbers and a well defined set of characters, mainly Latin letters.
>
> Character is a data type. Letter is an abstract idea of a shape used
in a script. It is not exact. Did you get perturbed by seeing two
entirely different simple letter a or g?
No.

> It is an idea that we do not think about again. This is another a,
this is another g. That is it. We do not ask who did this different one
trying to confuse us.
But ක් is not a Latin k, it’s a Sinhalese ka.
Print a k using any of the thousands of Latin fonts available and ask
what it is to any speaker (who can read) of any language written with
the Latin alphabet. It is most likely s/he will recognize the k.
Print a Sinhalese ක් and do the same experiment. I bet s/he will not
recognize a Latin k in it.
Why?
Because that’s not the same script.
Of course, there is a big amount of variation allowed in the design of
Latin letters, such as the proportions, the “slantedness”, the use of
serifs, the difference of thickness following the orientation of the
stroke, the boldness, the “italicness”, illumination, and many other
details.
Of course, a few letters accept different shapes.
But there are limits to the variation. As soon as nobody among the
users of languages written with the Latin alphabet recognises
any Latin letters, you can say for sure those aren’t Latin letters.

>
>
>> [128-255]. Don't get stuck with the names of the codepoints.
The stupid computer cannot read their names. What travels the network
are the bytes in their bare form.
> OK. But your 8-bit encoding won’t be ISO-8859-1 a.k.a. Latin-1.
>
> Semantics game.
YOU spoke about ISO-8859-1 which is an 8-bit character set (or an SBCS,
Single-Byte Character Set). Although that’s not a general rule, most
often the bytes are made of 8 bits. ISO-8859-1 defines characters and
control codes with numbers from 0x00 to 0xFF.

When I look at the source code of http://www.lovatasinhala.com, I read:
<meta http-equiv="Content-Type"
content="text/html;Charset=windows-1252">
windows-1252, a.k.a. CP1252, is another SBCS. That’s clearly a hack
because you say it is in CP1252 while it isn’t.

> Let it be 16-bit then. Still it works!
Yes, it works, but it is neither Latin-1 nor CP1252.
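
The point can be sketched in a few lines of Python: the bytes on the wire stay identical, and only the declared charset decides which characters they stand for. The sample bytes are arbitrary values chosen for this example, and `iso8859_7` (Greek) is just one alternative label picked for contrast; neither is taken from the site itself.

```python
# The same three bytes, interpreted under two different charset labels.
raw = bytes([0xE1, 0xE2, 0xE3])

as_cp1252 = raw.decode("cp1252")    # what a page labelled windows-1252 means
as_greek = raw.decode("iso8859_7")  # what the same bytes mean if labelled Greek

print(as_cp1252, [f"U+{ord(c):04X}" for c in as_cp1252])
print(as_greek, [f"U+{ord(c):04X}" for c in as_greek])

# Re-encoding under the respective label gives back the identical bytes.
assert as_cp1252.encode("cp1252") == as_greek.encode("iso8859_7") == raw
```

A font can draw any glyph it likes for those bytes, but every program that reads the charset label will treat them as the characters that label defines.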

>
>
>
>> When they are viewed, the user has the choice (theoretically) to
select the font.
> On most browsers it is easy to change the encoding. Changing the
font is less evident.
>
> Well, the browser selects it for you according to your computer's
(i.e. your) preferences. 'font-family: sans-serif' tells the browser to
select a font that matches that calling. 'font-family: samagana,
sans-serif' would make the browser to look for the existence of
'samagana' before falling back to a serif-less font.
>
> You said it is easy to change the encoding. Show me.
http://colson.eu/lovatasinhalagrec.png

>
>
>>
>> Stop your imagination and do this. Go to this site:
>> http://www.lovatasinhala.com
>> On the right-hand-side column (directly below the lion), there
is a link in a light-blue box that says, "Latin Script". Click on that
and get rid of the dreaded Singhala script and be happy. What you see is
not Icelandic. It is romanized Singhala. And if you want to really read
it, click on the next link below and see the pronunciation key.
>>
>> I have requested a fellow to translate at least the page on
Unicode to English. Hopefully, he does it quick.
>>
>>
>>> Consider that as the oft suggested IME. Haha!
>>>
>>>
>>> And your solution also does not work in multilingual
contexts;
>>>
>>> If mine does not work in some multilingual context, none of
the 14 languages I mentioned above including English and French don't
either.
>> They do because they use Latin letters, not Sinhalese letters.
>>
>> English, French and romanized Singhala do not work on
multilingual contexts. You are confusing letters and codepoints. Letters
are provided by FONTS in the user interface in the LOCAL device.
>>
>> By the way, how do you localize in France?
> On the Linux computer I’m using right now, I use a UTF-8 locale
(fr_BE.UTF-8). And I’m not in France. Other common locales are Latin-1,
and Latin-9. They are disappearing, but that’s a long process.
>
> My Linux has no such problems because it stays at default US. That is
the *sweet-spot* for small communities like Singhalese whose elite are
arrogant, lazy self-serving and indifferent. It is best for them to use
something that least involves the heavy hand of the powerful.
>
>
>
>> Do you know that the English writing was romanized when the
English people were forced into Christianity?
> Yes.
>
> Good. Have a little mercy for the Singhalese.
>
>
>
>> The only truly surviving English letter is þorn (þ).
> It comes from ᚦ. But there’s also some similarity between ᚠ and
F, ᚱ and R, ᚳ and K, ᚻ and H, ᛁ and I, ᛋ and S (with a rotation), ᛏ and
T, ᛒ and B, ᛗ and M, so Þ is not really the only surviving Futhorc letter.
>
> You are showing lot of squares, again demonstrating how bad it gets
when you leave the cozy Latin-1 corner. Latin script was not derived
from fuþorc.
I didn’t say that. Þ (thorn) comes from Elder Futhark. Runes and the
Latin alphabet have a common ancestor. That’s all.

> It came for the Latin set which is all capitals and only 22.
Not from Old Italic?

> Again þorn is the only letter we have inside Latin. æsc, eð etc. are
improvisations. They ran from the 'pagan' runes.
>
>
>
>>
>>
>>>
>>> it does
>>> not work with many protocols or i18n libraries for
applications.
>>>
>>> i18n is for multi-byte characters. Mine are single-byte
characters.
>> OK. Do it as you want, but it won’t be Unicode compliant.
>>
>> Thank you for your generosity, sire. I waited all this long for
it. (I am kidding).
>>
>>
>>
>>> As you see, the safest place is SBCS.
>> I don’t see. Why is it safer?
>>
>> Just compare romanized Singhala and Unicode Sinhala.
>> First, the display of the script is not guaranteed. You get
Character-not-found rows if you do not have the font.
> And you get Latin letters if you don’t have your special font.
>
> Yes! But the beauty of it is that you get readable romanized
Singhala, not garbage.
On most operating systems, if the current font doesn’t have support for
Sinhala, a fallback mechanism will automatically display the Sinhalese
text with a Sinhalese font. That’s the beauty of Unicode. Of course, in
that aged Windows XP, there are many things which don’t work as expected.

>
>
>> Then you see garbage with letters and signs mixed up if you did
not update your font renderer (e.g. uniscribe).
> Is your font independent from Uniscribe for Windows users?
> Where can I download your font?
>
> It was made as an Open Type font. That is it. First it would work
only in WorldPad from SIL.org.
It also works with Firefox AFAICT.

> Then as Uniscribe was developed, it started to work inside Windows
Notepad. When Uniscribe was fully developed, I asked MS VOLT group why
Notepad supports Open Type and not Word 2003 (including Calibri). They
said it is owing to a business decision, perhaps reserving it for high
end apps. When I asked if by 'high-end' they mean Publisher, they did
not answer. Now it works inside Office 2010. The Mac and Adobe Suite
supported it ever since I tested it in them (2004). Even now, Apple
Safari is the best browser for it. IE is the worse. In 2006 Firefox
supported it with my setting CSS text-rendering to geometricPrecision.
This was because someone said that my pages with thousands of ligatures
would grind the computer to a halt -- false assumption. Ever since I
kept that CSS directive there, though Webkit does not care. Chromium on
Linux does not form the letters, meaning no Open Type support. So, use
Firefox in your system. iPhone: Yes, Android: No.
>
> The font is only a proof-of-concept and it does not impose a specific
orthography.
> Go to:
> http://www.lovatasinhala.com/liyanna.php#unisin
> Click on the third bulleted item
Do you mean “veegavaþ síhala muðraa puvaruva (rapid Sinhala keyboard)”?
That’s a link to keyboard drivers, not to the OpenType font.

>
>
>> (Only Windows 7 comes with latest Uniscribe). Different fonts
have different levels of letter construction, and some have wrong
letters for wrong codepoints.
> Is that a problem with Unicode or a problem with the font perhaps
made by an incompetent person?
>
> Not a Unicode problem. The guys in Lanka did not update Uniscribe on
XP or Vista with cooperation with Microsoft. (Zzzzzzz). Yes. Lack of
knowledge about the orthography among font makers. In addition, they
maul the language on advice of SLS1134. The Unicode Sinhala font Apple
uses has wrong shapes
>
>
>
>> This is how it is in iPhone.
>>
>> When you transport Romanized Singhala, you do not need to
re-encode it (e.g. UTF-8) for the purpose and bloat it. There is not
even an HTML editor for it. You need to re-write all well established
and seasoned applications using updated compilers that added
wide-character functions.
>>
>> Here is a test for you. The following is a Unicode Sinhala
paragraph (a random copy from the web site http://divaina.com/ news web
site (Sunday issue). Your computer must be Plain Text ready for this. I
bet it is not.
>>
>> ළමයින්ගෙ අධ්‍යා පනය කඩාකප්පල් වෙනව තමයි. ඒත් ඉතිං මොකද කරන්නෙ? රටේ ආණ්‌ඩුවට
පණිවිඩයක්‌ දෙන්න ස්‌ට්‍රයික්‌ නැතුව බැරි වීම අවාසනාවන්ත තත්ත්ව යක්‌. මේ ප්‍රශ්න සමූහය දැන්
තීරණාත්මක තැනකට ඇවිත් තියෙනව.
>>
>> 1 Copy it to Notepad
> That’s a Windows software. Can I use Gedit instead?
>
> Sure.
>
>
>
>> 2. From Notepad, copy it to a new MS Word page
> That’s a Windows software. Can I use LibreOffice Writer instead?
>
> May be. But...
> I am trying to demonstrate a strange bug in Word. Second, if you want
to test *my* font, use something like Abiword. Plain text editors can't
be expected to have support of font rendering engines, except, of
course, Notepad which was employed to test Uniscribe.
>
>
>
>
>
>
>> 3. Copy what you pasted into Word back to Notepad below the original
>> 4 Copy that second one from Notepad back to Word below the one
it already has
>>
>> Observe that MS Word altered the codepoints in the underlying
text runs.
> Could you make a few screenshots?
>
> Sorry, no. It's about Unicode Sinhala and MS Word.
>
>
>
>>
>>>
>>> Or it
>>> requires specific constraints on web pages requiring
complex styling
>>> everywhere to switch fonts.
>>>
>>> Did you see http://www.lovatasinhala.com? May be you are
confusing Unicode Sinhala and romanized Singhala. Unicode Sinhala has a
myriad such problems.
>> Which problems?
>>
>> See above including the test.
> A test I can’t reproduce because I don’t have a copy of M$ Word.
>
> Too bad. I agree with M$.
>
>
>
>>
>>
>>> That is why it should be abandoned!
>> Why wouldn’t you try to solve the problems, whatever they
could be, instead of proposing an entirely new character set nobody will
support?
>>
>> There are only two solutions. ONE: Completely redefine the
Singhala code block .
> Impossible. What is already encoded cannot be changed. But new
characters could be added.
>
> Right. Therefore, the most practical solution is to use the best
solution that Singhala has anyway. That is to use the transliteration
and use conversion between the two, which I have already done.
>
>
>
>> TWO: Just abandon it and use the transliteration. Why go through
the trouble to satisfy fellows like you who do not use Singhala anyway?
> Do it if you like. I’m not sure your fellow countrymen will
follow you.
>
> True. You can take the horse to the water but can't make it drink:
> Turn on the volume and enjoy: http://www.lovatasinhala.com/assayaa.htm
>
>
>
>
>
>>
>> If the rendering engines don’t work as you expect they
should, how a new encoding scheme could solve the problem?
>>
>> The rendering engine works just fine! It is the code block that
is sick.
> It is not impossible it needs a treatment.
>
> Wishful thinking on your part.
>
>
>
>> You are way off base, buddy.
> Really?
>
> Absolutely.
>
>
>
>>
>>
>>> Please look at the web site and say it more coherently, if
I misunderstood you.
>>>
>>>
>>> Plain text searches in multilingual pages
>>> won't work. Usability tools won't work.
>>>
>>> Have you tried to search a vowel in Unicode Sinhala?
Romanized Singhala has no search problem. Try it in the my web site.
>> Well, perhaps there’re problems with search engines.
>>
>> Haha! I am not talking about search engines. I am talking about
text processing.
> What’s the difference? Search engines like Google look for a
string of characters on the WWW, to say it simply, while your favorite
word processor’s search function looks for it in a single document, but
that’s still a piece of software to update.
>
> Yeah, yeah, lame, lame...
>
>
>
>> I am sorry but talking to you is like the Singhala saying,
"biiri aliyaata veenaa gahanavaa vagee." -- Like playing the violin for
the deaf elephant.
> Isn’t the player deaf too?
>
> Possible.
>
>
>>
>> Wouldn’t it be possible to correct search engines instead of
inventing a new character set?
>>
>> You need to go back to school. There is no new character set. A
Unicode character is just a numeric code Unicode character database
goes from zero to some very big number. There are no holes in it to
define character sets for somebody's fancy. Well, Doug Ewell did one for
Esperanto expanding fuþorc. We need to do something practical, and I did
it already.
> When you use a new 8-bit encoding for your Sinhalese font, that
is a new character set. And it has nothing to do with Unicode. No need
to go back to school to understand it.
>
> Good.
>
>
>
>>
>>
>>>
>>> Really consider abandoning the hacked encoding of the
Sinhalese
>>> script itself.
>>>
>>> There is no re-encoding of Singhala. Singhala is
transcribed into Latin! When I say Singhala, I don't mean Unicode
Sinhala. It is the Singhala phoneme inventory that was transliterated.
>> Using Latin letters for a transliteration of Sinhala is not
a hack, but making fonts said to be Latin-1 with Sinhalese letters
instead of the Latin letters is a hack.
>>
>> Well, you can characterize the smartfont solution anyway you
like. The problem for you is that it works!
> That’s not a problem for me. Note that I can’t copy and paste it
to a text editor.
>
> You didn't try.
I did! And I got Latin letters instead of the Sinhalese ones.

>
>
>>
>> Sorry for this Kindergarten lesson, but you should understand
the role of the font.
>
Do you have a link to the font? http://…

>> A font is a support application at the User Interface level. It
is what the user decides to use to see underlying text runs in an
application's view port. The same text one person reads at the computer
in Arial others read in Helvetica. In the same manner, if I did not
deliver the font with the web page, you will see it in some sans-serif
font your computer has.
> But I’ll see Latin letters instead of Sinhalese ones.
>
> Yes! Instead of garbage, you see romanized Singhala anywhere, any
time, any device!
If it was Unicode Sinhala, I’d see it too.

> And if you have the font in the system, jaya, jaya, jaya!
But I don’t have the font. Where can I find it?

>
>
>> It is something that happens locally in the device. When text
moves between applications and between computers, they travel as numeric
codes representing the text in the form of digital bytes. The computer
can't say French from Singhala.
>
But many properties are assigned to each code-point by Unicode. For
example, if you type a lowercase letter at the beginning of a sentence,
it is possible that the letter be capitalized. For example, æ could be
replaced by Æ to which it seems you have assigned no Sinhalese letter. g
could be replaced by G. Did you assign the same Sinhalese letter to both
g and G?
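
A short Python sketch of that concern, using only the standard `unicodedata` module and `str.upper()`. The helper `has_case` is a name invented here. It shows that text stored as Latin code points carries Latin properties, whatever glyphs the font happens to draw, while a Unicode Sinhala letter has no case at all.

```python
import unicodedata


def has_case(ch):
    """True if Unicode case mapping would change this character."""
    return ch.upper() != ch or ch.lower() != ch


for ch in ("æ", "g", "ක"):
    print(
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),  # Ll = lowercase letter, Lo = other letter
        "upper ->", ch.upper(),
        "cased" if has_case(ch) else "caseless",
    )
```

An auto-capitalizing editor that turns æ into Æ changes the stored code point, so a font that maps only one of the two to a Sinhalese glyph would show a different result after the edit.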

>>
>>
>>>
>>> It will however be more valuable if you just
>>> concentrate on creating a simpler romanization system.
that will use
>>> standard Unicode encoding of Latin
>>>
>>> This is exactly what I did. Have I been talking to someone
who did not know what he was evaluating?
>> I think he was speaking of the transliteration, not of your
hack.
>>
>> I hope the fellow reads the above response. I wish you guys
lived close by here in US so that I could hold a special class to teach
you how computers function.
> I live in Belgium. What would you teach me?
>
> Well, come on over!
>
>
>
>>
>>
>>>
>>> (note that you are absolutely not
>>> limited to the reduced ISO 8859-1 subset for Latin and
that there's
>>> already a much richer set of letters, symbols and
diacritics for all
>>> needs ; but here again this requires using Unicode and
not just ISO
>>> 8859-1).
>>>
>>> Oh, thank you for the generosity of allowing me use of the
entire Latin repertoire. You don't have to tell that to me. I have
traveled quite a bit in the IT world. Don't be surprised if it is more
than what you've seen. (Did you forget that earlier you accused me of
using characters outside ISO-8859-1 while claiming I am within it? That
is because you saw IAST and PTS displayed. They use those wonderful
letters symbols and diacritics you are trying to tout. Is there a
problem with Asians using ISO-8859-1 code space even for transliteration?
>>>
>>>
>>> The bonus will be that you can still write the Sinhalese
>>> language with a romanisation like yours,
>>>
>>> Bonus?
>>>
>>> but there's no need to
>>> reinvent the Sinhalese script
>>>
>>> Singhala script existed many, many years since before the
English and French adopted Latin.
>> Did any body say it didn’t?
>>
>> He said reinvent the Singhala SCRIPT. The script is the script.
I use the same script in a more complete and correct manner than any
Unicode font even with my incomplete, rough design, proof-of-concept font.
>>
>>
>>
>>> What I did was saving it from the massacre going on with
Unicode Sinhala.
>> Which massacre? What’s wrong with the Unicode support of
Sinhala? Could you give details, please?
>>
>> I gave the details earlier in this response
> Not enough details
>
> Too bad.
>
> .
>
>
>>
>>
>>>
>>> itself that your encoding is not even
>>> capable of completely support in all its aspects (your
system only
>>> supports a reduced subset of the script).
>>>
>>> What is the basis for this nonsense?. (Little birds
whispering in the background. Watch out. They are laughing).
>>> My solution supports the entire script, Singhala, Pali and
Sanskrit plus two rare allophones of Sanskrit as well. Tell me what it
lacks and I will add it, haha! One time you said I assigned Unicode
Sinhala characters to the 'hack' font. What I do is assigning Latin
characters to Singhala phonemes. That is called transliteration. There
are no 'contextual versions' of the same Singhala letters like you said
earlier.
>>>
>>> Ask your friends what they have more than mine in the
Singhala script. Ask them why they included only two ligatures when
there are 15 such.
>> Can’t you make a proposal or describe the missing letters?
>>
>> Let it rot in place. (Lankan government might need it to get
loans from WB to feed the IT guys over there). I proved that it is not
necessary. Romanizing takes care of it and the native readers can use
the orthographic font if they want. Otherwise, they can use Latin script
just like you and I do here. Remember that the font is a local decision.
It need not go out of your computer and cause heart ache among people
like you. The following is the first sentence at:
>> http://www.lovatasinhala.com/liyanna.php
>> oba kiyavana ðeruva heøa kramaya viðyaaþmakava haa
vyaakaraµaanukuulava saðaa æþi nisaa, eya batahira yuroopiiya bhaaxaa
parigaµakaya þula labana varaprasaaða elesama síhalataþ labaa ðeyi.
>>
>> I suggest you get with it and move on.
>>
>>
>>
>>> Ask them how many Singhala letters there are.
>>>
>>>
>>> Even the legacy ISCII system (used in India) is better,
because it is
>>> supported by a published open standard, for which
there's a clear and
>>> stable conversion from/to Unicode.
>>>
>>> My solution is supported by two standards: ISO-8859-1 and
Open Type. ISO-8859-1 is Basic Latin plus Latin-1 Extension part of
Unicode standard.
>> It is not supported by ISO-8859-1. ISO-8859-1 is for Latin
letters, not Sinhalese ones.
>>
>> It is worth your traveling to America to learn what is a
character encoding. A character set is not anything you go and ask
permission to use it. If you use it, you have used it.
> I’ve just said your font is not supported by Latin-1 which
defines an encoding for some LATIN letters.
>
> Say whatever you wish, but it works!
It works but it is not supported by ISO-8859-1, nor by ISO-8859-2, nor
by ISO-8859-3, etc.

> It is subjective for you to say that, Jean. At least for English, you
think 'how to spell this word'. For Singhala, you think 'how to write
this sound'. A Singhala person presses g for ග and a for අ without
hesitation. That too is subjective. When they in Sri Lanka and I made a
'better' keyboard for Unicode Sinhala, we assigned Singhala letters to
English keys by following that expectation. Earlier, they had ග where
period resides on the QWERTY layout. That makes people angry.
>
>
>
>>
>>
>>>
>>> Bottom line is this: If Latin-1 is good enough for English
and French, it is good enough for Singhala too.
>> No, because Sinhala is not written with Latin letters.
>>
>> Declarations like that won't work in a technical discussion. You
need to explain. Singhala is a language. Singhala native SCRIPT is the
traditional way it is written. When I write Jean I really entered the
four code points: 74 101 97 and 110. When you write naena, you enter 110
97 101 110 and 97. We think the former is a name of a pretty girl
>
> Which one? Jean? That’s a male name and only the first part of my
first name.
>
> Absolutely true.
>
>
>
>> and the latter is a name I made up not in a particular language.
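
The code point values quoted above can be checked with a couple of lines of Python. The function name `code_points` is chosen for this sketch only.

```python
def code_points(text):
    """Return the decimal code point of each character in text."""
    return [ord(ch) for ch in text]


print(code_points("Jean"))
print(code_points("naena"))

# In ISO-8859-1 each of these code points is stored as one byte of equal value.
print(list("Jean".encode("latin-1")))
```

Both words are made of the same kind of numbers, and ISO-8859-1 defines each of those numbers as a specific Latin letter.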
>>
>>
>>
>>> And if Open Type is good for English and French, it is good
for Singhala too.
>> Of course.
>>
>> Thank you for that.
>>
>
>

Jean-François Colson
(Jean is another name but not mine.)
Received on Tue Jul 17 2012 - 13:14:09 CDT
