Re: About that alphabetician...

From: Brian Doyle (brian@gael-image.com)
Date: Thu Sep 25 2003 - 13:49:15 EDT

Next message: Marco Cimarosti: "RE: About that alphabetician..."

Previous message: Maung TunTunLwin: "Re: Questions on Myanmar encoding"
In reply to: Eric Muller: "Re: About that alphabetician..."
Next in thread: Michael Everson: "Re: About that alphabetician..."
Reply: Michael Everson: "Re: About that alphabetician..."
Reply: John Burger: "Re: About that alphabetician..."
Reply: Curtis Clark: "Re: About that alphabetician..."

Eric,

Forgive my density. I'm not sure that I understand. Are you arguing that an
ASCII encoding scheme (ISO-8859-1) is not a limitation because,
semantically, all of the characters (a, b, c, etc.) also exist in the
Unicode scheme?

It makes sense to me that ASCII is not a limitation for those documents that
are limited to that character set. But, your own message, "which contains
U+10DB მ GEORGIAN LETTER MAN and U+092E म DEVANAGARI LETTER MA" triggers an
error message in my own email client (Entourage X), namely:

"Some text in this message is in a language that your computer cannot
display."

I'm not certain if I'm seeing this because I don't possess a font to display
those characters or for some other reason. I suspect the missing font is the
cause because, when I try to look up those characters in OS X's Character
Palette, the Georgian and Devanagari Unicode blocks show up blank.

The observation that I, the "Irish (American) colleague," made to Michael
was that there is a sentence in the NYT article displayed in my browser that
dropped the U+00E7 LATIN SMALL LETTER C WITH CEDILLA (e.g., François).

There's nothing in the paragraph in question to indicate that there is a
missing character--nor is there a numeric code displayed for a savvy user to
look up.

Surely in this context, we would agree that the semantic content was
distorted, yes?

Sincerely,
Brian Doyle
Unicode newbie

On 9/25/03 11:54 AM, "Eric Muller" <emuller@adobe.com> wrote:

>
>
> Michael Everson wrote:
>> An Irish colleague here said he liked the article but noted that the Times'
>> web directors don't use Unicode....
>>
>>
>>> ...
>>> <meta http-equiv="charset" content="iso-8859-1">
>>> ...
>>>
>>>
> There is an alternative point of view, which says that charset declared in an
> HTML (or XML) document is no more than an encoding scheme, and that all
> characters in those documents are fundamentally Unicode characters (i.e. they
> start in life with the full semantic of Unicode, they don't inherit it on the
> occasion of character set conversion). That view is supported by the XML spec
> itself, and by the infoset definition. And because we have numeric character
> entities, using an iso-8859-1 encoding scheme is not really a limitation:
> witness this message, which contains U+10DB მ GEORGIAN LETTER MAN and U+092E म
> DEVANAGARI LETTER MA.
>
> Eric.
>
>
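
[Editorial sketch, not part of either message.] Eric's point about numeric
character references can be illustrated with a few lines of Python. This is
only a minimal sketch using the standard library: the variable names are
made up for illustration, and the last step shows just one hypothetical way
a character such as U+00E7 could vanish silently. Nothing in this thread
says what actually happened on the NYT page.

```python
import html

# U+10DB GEORGIAN LETTER MAN and U+092E DEVANAGARI LETTER MA
text = "\u10db and \u092e"

# Neither character exists in ISO-8859-1, so each is written out as a
# numeric character reference instead of being dropped.
encoded = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(encoded)  # b'&#4315; and &#2350;'

# A reader that understands the references recovers the original text.
decoded = html.unescape(encoded.decode("iso-8859-1"))
print(decoded == text)  # True

# Assumption for illustration only: a pipeline that discards anything it
# cannot encode loses the character and leaves no trace of the loss.
name = "Fran\u00e7ois"
lossy = name.encode("ascii", errors="ignore").decode("ascii")
print(lossy)  # Franois
```

Whether the recovered characters are actually displayed is a separate
matter: it still depends on the reader's software and installed fonts,
which is the situation Brian describes above.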


This archive was generated by hypermail 2.1.5 : Thu Sep 25 2003 - 14:43:15 EDT