Re: "Interoperability is getting better" ... What does that mean? from Naena Guru on 2013-01-01 (Unicode Mail List Archive)

From: Naena Guru <naenaguru_at_gmail.com>
Date: Tue, 1 Jan 2013 17:53:36 -0600

In the HTML 4 days, ISO-8859-1 was the default character set for pages that
used a single-byte character set (SBCS), that is, characters from the Basic
Latin and Latin-1 Supplement blocks. At least that is what the Validator
(http://validator.w3.org/) said.

(By the way, Unicode is quietly suppressing the Basic Latin block by removing
it from the Latin group at the top of the code charts page (
http://www.unicode.org/charts/) and hiding it under different names in the
lower part of the page.)

Now the validator complains, correctly, that some characters in those pages
do not belong to ISO-8859-1 if you use bullet points, ellipses, etc. It says
they come from Windows-1252. That is true. If you declare these pages as
UTF-8, then it throws off *all* the non-ASCII Latin-1 characters and the web
pages show the character-not-found glyph.
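
A minimal sketch of the mismatch described above, in Python. The sample
string and variable names are made up for illustration; the point is only
that bytes saved as Windows-1252 do not survive being read as UTF-8.

```python
# Hypothetical sample: a bullet, an accented letter, and an ellipsis.
text = "\u2022 caf\u00e9 \u2026"

# Save as Windows-1252: bullet -> 0x95, e-acute -> 0xE9, ellipsis -> 0x85.
cp1252_bytes = text.encode("cp1252")

# Read back as ISO-8859-1: 0x95 and 0x85 become C1 control characters,
# which is why a validator can say they do not belong to ISO-8859-1.
as_latin1 = cp1252_bytes.decode("latin-1")

# Read back as UTF-8: every byte above 0x7F is an invalid sequence here,
# so each one is replaced with U+FFFD (the replacement character).
as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

print(cp1252_bytes)
print(repr(as_latin1))
print(repr(as_utf8))
```

The ASCII letters come through unchanged in all three readings; only the
bytes above 0x7F are affected.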

Windows-1252 replaces the C1 control codes of Latin-1 (the 32 positions
from 0x80 to 0x9F) with some punctuation marks and a few additional letters
used by some European languages.
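
To see which characters occupy those positions, here is a small Python
sketch that lists the range 0x80 to 0x9F as Python's own "cp1252" codec
decodes it. The function name is hypothetical, and the handful of positions
that this codec leaves unassigned are reported as such rather than guessed.

```python
import unicodedata

def cp1252_c1_table():
    """Return (byte, character or None, name) for bytes 0x80..0x9F."""
    rows = []
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # Python's cp1252 codec leaves this position undefined.
            rows.append((byte, None, "(unassigned)"))
            continue
        rows.append((byte, ch, unicodedata.name(ch)))
    return rows

for byte, ch, name in cp1252_c1_table():
    print(f"0x{byte:02X}  {name}")
```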

There is one main consideration in the mind of the web developer: make the
file as small as possible. Try this: make a text file in Windows Notepad
and save it in the ANSI, Unicode and UTF-8 formats. The ANSI file
(Windows-1252) will be the smallest. Why should people make their pages
larger just to satisfy some people's idea of perfection? It reminds me of the
Plain Text and language detection myths.
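
The Notepad experiment can be approximated in Python. This is a sketch
under assumptions: that Notepad's "Unicode" option means UTF-16 little-endian
with a byte order mark, and that its "UTF-8" option also writes a byte order
mark. The function name and sample text are made up for illustration.

```python
import codecs

def encoded_sizes(text):
    """Byte counts for the three Notepad-style encodings (assumed layouts)."""
    return {
        # "ANSI": Windows-1252, one byte per character, no BOM.
        "ansi": len(text.encode("cp1252")),
        # "Unicode": assumed UTF-16LE preceded by a 2-byte BOM.
        "unicode": len(codecs.BOM_UTF16_LE + text.encode("utf-16-le")),
        # "UTF-8": assumed to be preceded by a 3-byte BOM.
        "utf8": len(codecs.BOM_UTF8 + text.encode("utf-8")),
    }

# Hypothetical sample with a few non-ASCII characters.
sample = "na\u00efve \u2022 caf\u00e9\u2026"
sizes = encoded_sizes(sample)
print(sizes)
```

For text that is pure ASCII, the Windows-1252 and UTF-8 byte counts differ
only by the byte order mark, so the size gap depends on how many non-ASCII
characters the page contains.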

On Mon, Dec 31, 2012 at 8:44 AM, Asmus Freytag <asmusf_at_ix.netcom.com> wrote:

> On 12/31/2012 3:27 AM, Leif Halvard Silli wrote:
>
>> Asmus Freytag, Sun, 30 Dec 2012 17:05:56 -0800:
>> The Web archive for this very list, needs a fix as well …
>>
>
>
> The way to formally request any action by the Unicode Consortium is via
> the contact form (found on the home page).
>
> A./
>
>
Received on Tue Jan 01 2013 - 17:55:58 CST