RE: Is there Unicode mail out there?

From: Ayers, Mike (Mike_Ayers@bmc.com)
Date: Thu Jul 19 2001 - 14:11:35 EDT


> From: Shigemichi Yazawa [mailto:yazawa@globalsight.com]

> XML states "Its goal is to enable generic SGML to be served, received,
> and processed on the Web in the way that is now possible with HTML."
> But, in my opinion, XML has grown far beyond its original goal.
> XML seems to be used in every aspect of software engineering
> these days.

        True, but don't blame W3C for the digital hammer effect.

> Tagging disallowed characters is one way to work around the
> problem. But I don't buy this solution for two reasons.
>
> 1. Markup is for describing a document's structure. Section 1
> (Introduction) of the XML spec
> says "Markup encodes a description of the document's storage layout
> and logical structure."

        That's how it works in theory. In practice, however, pictures,
applets, and many other non-structural components are encoded with markup.

> 2. This is a proprietary solution. To get the original character, the
> application needs to know the semantics of the markup and needs to
> know how to decode the data appropriately. If it's the standard
> encoding like NCR, that's fine because everybody knows how to deal
> with it. But the tagging is specific to a DTD. It makes it difficult
> to interchange the data.

        I'm proposing it as a convention, not a proprietary solution. I
agree that a standard solution would be preferred, especially Martin's
suggestion of permitting the escape codes but not the characters. I
proposed the markup as a workaround until a better solution could be found.
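
        A rough sketch of what such a tagging convention could look like, in
Python. The element name "ctl" and its "code" attribute are made up purely for
illustration; nothing in this thread fixes a spelling. The range check follows
the Char production of XML 1.0.

```python
from xml.sax.saxutils import escape


def is_xml10_char(ch):
    """True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def tag_disallowed(text):
    """Return text as XML character data, wrapping disallowed characters.

    Allowed characters are escaped normally (&, <, >); each disallowed one
    becomes a hypothetical <ctl code="XXXX"/> element carrying its code
    point in hex.
    """
    out = []
    for ch in text:
        if is_xml10_char(ch):
            out.append(escape(ch))
        else:
            out.append('<ctl code="%04X"/>' % ord(ch))
    return "".join(out)


print(tag_disallowed("a\x1bb<"))
```

The receiving side has to know what "ctl" means in order to get the original
character back, which is exactly the interchange cost raised in point 2.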

> This character restriction in XML makes creating an XML document
> difficult.

        The work has to be done somewhere. Emerging technologies must be
compatible with existing ones, and some old technologies hang around a long
time. Really, the disallowing of control characters makes sense, since
their interpretation in so many existing protocols is "wreak havoc upon the
unsuspecting". You simply can't send these characters around the internet
and expect them to arrive unchanged.
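
        For anyone who wants to see the restriction in action, here is a small
check using the expat-based parser in Python's standard library (the sample
documents are my own, chosen only for illustration). It suggests that XML 1.0
rejects a control character such as ESC (U+001B) both as a raw character and as
a numeric character reference, while tab is accepted either way.

```python
import xml.etree.ElementTree as ET


def parses(doc):
    """Return True if doc is accepted as well-formed XML by the parser."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


samples = {
    "raw ESC": "<a>\x1b</a>",
    "NCR for ESC": "<a>&#27;</a>",
    "raw tab": "<a>\t</a>",
    "NCR for tab": "<a>&#9;</a>",
}

for name, doc in samples.items():
    print(name, "->", "accepted" if parses(doc) else "rejected")
```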

/|/|ike


