Re: "Interoperability is getting better" ... What does that mean?

From: Leif Halvard Silli
Date: Wed, 09 Jan 2013 01:55:36 +0100

Naena Guru, Tue, 8 Jan 2013 15:56:52 -0600:

> The statement,
> the death of most character sets makes everyone's systems smaller and
> faster
> is *FALSE*. Compare the sizes of the following two files that are copies of
> a newspaper article. The top part in red has few more words in romanized
> Singhala in the romanized Singhala file. Notice the size of each file:
> 1. size:38,092 bytes
> 2. size:18,922 bytes
  [ … ]
> Again *demonstrably WRONG*

To double check your statement, I saved the above two pages in Safari’s
webarchive format[1] and compared the resulting size of each archive
file. The benefit of such a comparison is that we then get to
count the HTML page *plus* all the extra fonts that are included in
the "romanized Singhala file". Thus, we get a more *realistic* basis for
comparing the relative size of the two pages. Here are the results:

1., webarchive size: 205 459 bytes
2., webarchive size: 223 201 bytes

As you can see, the "romanized Singhala file" loses - it becomes
bigger than the UTF-8 version. I suppose the reason for this is that,
for the "romanized Singhala file", the browser has to download
fonts in order to display the "romanized Singhala". (I tried to do the
same in Firefox, using its ability to save the "complete" page, but
for some reason it did not work.)

I also ran a test on both pages with the YSlow service.[2] Here is the
total weight of each page, according to YSlow, when run from Firefox:

1., YSlow size: 92.7K
2., YSlow size: 65.7K

And here are the YSlow results from Safari:

1., YSlow size: 11.2K
2., YSlow size: 9.0K

It is rather interesting that Safari and Firefox differ that much. But
anyhow, the YSlow results are pretty clear, and demonstrate that while
the "romanized Singhala" page is smaller, it is only between 20 and 30
percent smaller than the Unicode page.
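For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the relative differences from the figures quoted in this message. It assumes that file 1 is the Unicode (UTF-8) page and file 2 is the "romanized Singhala" page; the function name and the labels are made up for illustration only.

```python
def percent_smaller(unicode_size, romanized_size):
    """Percentage by which the romanized page is smaller than the Unicode page.

    A negative result means the romanized page is actually larger.
    """
    return (unicode_size - romanized_size) / unicode_size * 100


# (Unicode page, romanized page), figures copied from this message.
# Assumption: file 1 is the Unicode page and file 2 the romanized one.
measurements = {
    "raw HTML (bytes)": (38092, 18922),
    "webarchive (bytes)": (205459, 223201),
    "YSlow, Firefox (K)": (92.7, 65.7),
    "YSlow, Safari (K)": (11.2, 9.0),
}

for label, (unicode_size, romanized_size) in measurements.items():
    difference = percent_smaller(unicode_size, romanized_size)
    print(f"{label}: {difference:+.1f}%")
```

With these numbers the two YSlow measurements come out at roughly 29 and 20 percent, while the webarchive figure is negative, that is, the romanized archive is the larger one.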

However, despite the slightly bigger size, YSlow in Firefox (don't know
how to see it in Safari) *still* reported that the Unicode page loaded

Furthermore, when I inspected the source code of these two documents,
I discovered that for the Unicode file, you included *two*
downloadable fonts, whereas for the "romanized Singhala" page, you only
included *one* downloadable font. (Why? Because both files actually
contain some "romanized Singhala"!) Before we can *really* take those
two test pages seriously, you must make sure that both pages use the
same number of fonts! As it is, I strongly suspect that if you had
included the same number of downloadable fonts in both pages, then the
Unicode page would have won.

Of course, the "romanized Singhala" page has many usability problems as
well: 1) it doesn't work with screen readers (users will hear the text
as Latin text), 2) it doesn’t work with Find-in-page search (users will
type in Sinhala, but since the content is actually Latin, they won’t
find anything on the page), 3) the title of the "romanized Singhala"
page is (I believe) not actually readable as Singhala, 4) there are
many browsers in which the "romanized Singhala" file will not display:
text browsers, Opera and any browser where CSS is disabled, 5) you get
all kinds of problems with form submission.
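To illustrate point 2, here is a small Python sketch of why Find-in-page fails. It is only an approximation under stated assumptions: the romanized spelling "amma" is a made-up example rather than text taken from the test page, and a plain substring match stands in for what a browser's search does.

```python
import unicodedata

# Hypothetical example word as each page might store it.
unicode_page_text = "\u0d85\u0db8\u0dca\u0db8\u0dcf"  # Sinhala code points
romanized_page_text = "amma"  # Latin code points (assumed romanization)


def scripts(text):
    """Return the script names found in text, taken from the first word
    of each character's Unicode name (e.g. 'SINHALA', 'LATIN')."""
    return {unicodedata.name(ch).split()[0] for ch in text}


def find_in_page(page_text, query):
    """Plain substring match, a rough stand-in for Find-in-page."""
    return query in page_text


# The user types the word with a Sinhala keyboard.
user_query = "\u0d85\u0db8\u0dca\u0db8\u0dcf"

print(scripts(unicode_page_text), find_in_page(unicode_page_text, user_query))
print(scripts(romanized_page_text), find_in_page(romanized_page_text, user_query))
```

The font makes the Latin letters *look* like Sinhala, but the stored characters are still Latin, so the Sinhala query only matches the Unicode page.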

Conclusion: Your claims about the file size advantage of "romanized
Singhala" seem grossly exaggerated, if at all true, based as they are
on a test of two files which aren't actually equal when it comes to the
extra CSS stuff that they embed.


leif halvard silli
Received on Tue Jan 08 2013 - 18:59:21 CST
