Re: XML attribute normalization and Unicode in C language

From: Din%$h (xydinesh@gmail.com)
Date: Mon Jun 06 2005 - 06:02:23 CDT

    Hi Mike,
    Support for both UTF-8 and UTF-16 encodings was added to a basic SOAP parser
    called "Guththila". I think it will be helpful to you. Take a look at it.

    http://www.cse.mrt.ac.lk/~premalwd/Projects/Guththila.html

    It was developed in C++, but the conversion methods are implemented as
    separate functions.
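
    As a rough illustration of what such a standalone conversion function can
    look like, here is a minimal sketch in plain C. It is not taken from
    Guththila: the name utf16_to_utf8, its signature, and the choice to replace
    unpaired surrogates with U+FFFD are assumptions made for this example. The
    input is assumed to be UTF-16 code units already in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not Guththila's API): converts `len` UTF-16 code
 * units in host byte order to UTF-8. Writes at most `cap` bytes to `out`
 * and returns the number of bytes written, or (size_t)-1 if `out` is too
 * small. Unpaired surrogates are replaced with U+FFFD (an assumption). */
size_t utf16_to_utf8(const uint16_t *in, size_t len,
                     unsigned char *out, size_t cap)
{
    size_t i = 0, n = 0;

    while (i < len) {
        uint32_t cp = in[i++];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: combine with the following low surrogate. */
            if (i < len && in[i] >= 0xDC00 && in[i] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i] - 0xDC00);
                i++;
            } else {
                cp = 0xFFFD; /* high surrogate without a partner */
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* stray low surrogate */
        }

        /* Number of UTF-8 bytes needed for this code point. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (cap - n < need)
            return (size_t)-1;

        if (need == 1) {
            out[n++] = (unsigned char)cp;
        } else if (need == 2) {
            out[n++] = (unsigned char)(0xC0 | (cp >> 6));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[n++] = (unsigned char)(0xE0 | (cp >> 12));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (unsigned char)(0xF0 | (cp >> 18));
            out[n++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return n;
}
```

    With a function like this, attribute values read from a UTF-16 document can
    be brought to UTF-8 first, so the rest of the normalization code only has
    to deal with one encoding.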

    thanks,
    Dinesh

    On 6/4/05, Philippe Verdy <verdy_p@wanadoo.fr> wrote:
    >
    > Why not use an XML parser to do this job?
    >
    > Using Xerces with the SAX interface to enumerate the various items will
    > allow you to support lots of encodings (including UTF-8 and UTF-16). Then,
    > in the callback that receives the parsed and isolated string items, you
    > can use a normalization function to transform them and generate the new
    > XML document on the fly.
    >
    > It's really not complicated to do with the Xerces+ICU pair, and it is an
    > example of a simple transformation of an XML document.
    >
    > You could use a DOM-based API as well. However, DOM requires parsing the
    > whole document before you can browse the tree of elements and attributes
    > to generate a new document. One benefit is that DOM naturally "normalizes"
    > the values of attributes and their relative order, in addition to
    > resolving the various entities. That lets you, for example, normalize and
    > unify the namespaces as well if you want to build a coherent set of XML
    > files using the same set of namespace prefixes.
    >
    > ----- Original Message -----
    > From: "Mike Hao" <mike_jjhao@yahoo.com>
    > To: <unicode@unicode.org>
    > Sent: Friday, June 03, 2005 6:41 AM
    > Subject: XML attribute normalization and Unicode in C language
    >
    >
    > > Hi All,
    > >
    > > I am not sure if this is the right group to post my
    > > question. Hope I can get some help or hint from you.
    > >
    > > I am working on a project which needs to normalize XML
    > > attribute values using the C programming language. I need
    > > to support UTF-8 and UTF-16 encodings. Currently I cannot
    > > think of a good solution to it. Does anyone have
    > > such experience to share with me? Or could you tell
    > > me what's the right way to do it?
    >
    >
    >
    >

    -- 
    W.Dinesh Premalal
    premalwd@cse.mrt.ac.lk
    http://www.cse.mrt.ac.lk/~premalwd/
    

