RE: japanese xml

From: Misha.Wolf@reuters.com
Date: Thu Aug 30 2001 - 14:10:52 EDT


I'm glad to see that one recipient didn't read something
bizarre into my perfectly simple and helpful reply to the
questioner.

Misha

On 30/08/2001 18:27:10 Addison Phillips wrote:
> That's not what he said in the responses *I* read. Perhaps I missed one on
> this thread. As near as I recall, Misha wrote:
>
> "Of course, EUC (EUC-JP in the
> case of Japanese) may cover all the characters you require, in which
> case there is no problem. Additionally, if you are thinking of XML (or
> HTML) then you can encode *all* Unicode characters in an EUC-encoded
> document, by employing numeric character references for characters
> outside the EUC character repertoire."
>
> IOW, that's not "EUC-unicode". I don't see a mention anywhere of that
> (hypothetical) encoding. That's "EUC-JP with characters outside EUC-JP
> represented as NCRs", and our parser handles that quite well...
>
> Addison
>
> -----Original Message-----
> From: Ayers, Mike [mailto:Mike_Ayers@bmc.com]
> Sent: Thursday, August 30, 2001 10:00 AM
> To: 'Addison Phillips [wM]'
> Cc: unicode@unicode.org
> Subject: RE: japanese xml
>
>
>
> > From: Addison Phillips [wM] [mailto:aphillips@webmethods.com]
> > Sent: Thursday, August 30, 2001 09:51 AM
>
>
> > 4. However, you can use any other encoding, provided you tag the file
> > appropriately (so that the parser knows what the encoding is and can
> > translate it to its internal representation).
>
> Slight but relevant correction: you can use any encoding of which
> the parser is aware.
>
> > 5. You are not required to use EUC-JP for your Japanese XML
> > files: you can
> > use the Unicode encodings directly. In some cases, though, your file
> > editing software may make it easier to work with EUC-JP (or
> > Shift-JIS/Microsoft Code Page 932).
>
> Misha was not talking about EUC-JP, rather EUC-unicode (or some name
> like that), which encodes unicode scalar values using the EUC method, and
> uses character references for those values (most of them) that are outside
> of the EUC encoding range. Have you tested your parser against that?
>
>
> /|/|ike
>
>
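
For illustration, the approach quoted above ("EUC-JP with characters
outside EUC-JP represented as NCRs") can be sketched as follows. This is
only a sketch, assuming Python 3 and its standard euc_jp codec; the
function name to_euc_jp_with_ncrs and the sample string are hypothetical
and do not come from the thread. It also assumes the text is already
well-formed XML character data, since the error handler used here does
not escape "&" or "<".

```python
import xml.etree.ElementTree as ET


def to_euc_jp_with_ncrs(text: str) -> bytes:
    """Encode text as EUC-JP, writing any character outside the
    EUC-JP repertoire as a decimal numeric character reference."""
    # "xmlcharrefreplace" is a standard codec error handler: an
    # unencodable character becomes "&#NNNN;" in the output bytes.
    return text.encode("euc_jp", errors="xmlcharrefreplace")


# Hypothetical sample: Japanese text, a Thai letter (U+0E01) and an
# emoji (U+1F600). The last two are not in the EUC-JP repertoire.
sample = "\u65e5\u672c\u8a9e \u0e01 \U0001F600"

encoded = to_euc_jp_with_ncrs(sample)

# Decoding the bytes as EUC-JP shows the references in place.
decoded = encoded.decode("euc_jp")

# An XML parser resolves the references back to the original characters.
# The already-decoded string is parsed here, so no encoding declaration
# is involved at this step.
element = ET.fromstring("<p>" + decoded + "</p>")
round_tripped = element.text
```

A real document would still need to be tagged with encoding="EUC-JP" in
its XML declaration, and, as Mike notes, the receiving parser must know
that encoding.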

