Re: Surrogate pairs and UTF-8

From: Edward Trager (ehtrager@umich.edu)
Date: Thu Jun 22 2006 - 14:30:38 CDT


    Hi, Pavils,

    ... Correct me if I am missing something:

    AJAX frameworks presumably have no problem whatsoever transferring data
    directly in UTF-8 format. UTF-8 is the default encoding for XML. So, once
    the data get to the client, all one has to do is parse the UTF-8 strings
    directly out of the XML (assuming AJAX based on XMLHttpRequest) and wrap them
    inside of some XHTML tags for display. Where is the need to escape strings
    in XML? UTF-8 can encode all Unicode code points.
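
A minimal sketch of the point about UTF-8 covering the whole code space: the function below turns one code point into its 1 to 4 byte UTF-8 sequence. The name codePointToUtf8 is hypothetical and not taken from any page mentioned in this thread. The sketch assumes its input is a Unicode scalar value, so surrogate code points (U+D800 to U+DFFF) are rejected rather than encoded.

```javascript
// Encode one Unicode scalar value (0..0x10FFFF, excluding surrogates)
// as an array of UTF-8 byte values.
function codePointToUtf8(cp) {
  if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
    throw new RangeError("not a Unicode scalar value: " + cp);
  }
  if (cp < 0x80) {
    // 1 byte: 0xxxxxxx
    return [cp];
  }
  if (cp < 0x800) {
    // 2 bytes: 110xxxxx 10xxxxxx
    return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)];
  }
  if (cp < 0x10000) {
    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)];
  }
  // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  return [
    0xF0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3F),
    0x80 | ((cp >> 6) & 0x3F),
    0x80 | (cp & 0x3F)
  ];
}
```

For example, U+1D11E (MUSICAL SYMBOL G CLEF), which needs a surrogate pair in UTF-16, comes out as the four bytes F0 9D 84 9E.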

    -- Ed Trager

    On Thursday 22 June 2006 13:35, you wrote:
    > Ok, group, so here's the fruit, I make it public for the benefit of all:
    > http://www.jurjans.lv/dhtml/utf8.html
    >
    > The page contains both encoder and decoder in JavaScript. I believe the
    > implementation is correct, however I have not mass-tested it with all those
    > fancy antique scripts.
    >
    > As I already said, the need to do this in JavaScript comes from the necessity
    > to transfer data to and from the server in an AJAX-based framework, that is,
    > to submit and receive complex data without a page refresh. It is natural to
    > create XML-format packages for those data, and when it comes to
    > transferring any kind of strings, one needs to escape them somehow so that
    > all codepoints pass through the XML format. I chose to follow the encoding
    > that is provided by function encodeURIComponent(), and I just needed to
    > rewrite it in JavaScript, to support older browsers. So I did that. To be
    > sincere, I can not think of any alternative method that would allow total
    > Unicode support for transferring string data, together with other complex
    > and typed data like dates, booleans and regular expressions.
    >
    > Addison, you should think about JavaScript in wider context than just web.
    > ECMAScript is supported in very many environments, and whenever it comes
    > to creating files and/or transferring data to a server, some
    > hand-coded encoding routines may come in handy.
    >
    > Regards,
    >
    > Pavils Jurjans
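
For readers following the quoted message, here is a rough sketch of the kind of hand-written encoder it describes: percent-escaping a JavaScript string as UTF-8, the way encodeURIComponent() does, without calling that function. It is not the code from the linked page. The names percentEncodeUtf8 and UNRESERVED are hypothetical, and throwing a plain Error on an unpaired surrogate is an assumption of this sketch (the built-in function throws a URIError).

```javascript
// Characters that encodeURIComponent() leaves unescaped.
var UNRESERVED =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.!~*'()";

// Format one byte value as %XX with uppercase hex digits.
function percentByte(b) {
  return "%" + (b < 16 ? "0" : "") + b.toString(16).toUpperCase();
}

// Percent-encode a string as UTF-8, combining UTF-16 surrogate pairs
// into a single code point before encoding.
function percentEncodeUtf8(str) {
  var out = "";
  for (var i = 0; i < str.length; i++) {
    var cp = str.charCodeAt(i);
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: it must be followed by a low surrogate.
      var low = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (low < 0xDC00 || low > 0xDFFF) {
        throw new Error("unpaired high surrogate at index " + i);
      }
      cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
      i++; // the low surrogate has been consumed
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      throw new Error("unpaired low surrogate at index " + i);
    }
    if (cp < 0x80) {
      var ch = String.fromCharCode(cp);
      out += UNRESERVED.indexOf(ch) !== -1 ? ch : percentByte(cp);
    } else if (cp < 0x800) {
      out += percentByte(0xC0 | (cp >> 6)) + percentByte(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += percentByte(0xE0 | (cp >> 12)) +
             percentByte(0x80 | ((cp >> 6) & 0x3F)) +
             percentByte(0x80 | (cp & 0x3F));
    } else {
      out += percentByte(0xF0 | (cp >> 18)) +
             percentByte(0x80 | ((cp >> 12) & 0x3F)) +
             percentByte(0x80 | ((cp >> 6) & 0x3F)) +
             percentByte(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

Where encodeURIComponent() is available, its output for the same well-formed input makes a convenient cross-check, for instance on a string containing the surrogate pair "\uD834\uDD1E".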


