Re: ldml dtd location

From: Theo Veenker (Theo.Veenker@let.uu.nl)
Date: Mon Sep 26 2005 - 00:47:19 CST

    Philippe Verdy wrote:
    > From: "Mark Davis" <mark.davis@icu-project.org>
    >
    >> Theo Veenker wrote:
    >>
    >>> The current doctype declaration of an LDML file is:
    >>> <!DOCTYPE ldml SYSTEM "http://www.unicode.org/cldr/dtd/1.3/ldml.dtd">
    >>> In my setup I just want to use a locally installed copy of the DTD. So
    >>> my request would be to use a public identifier for the official
    >>> 'location' (to be known by the app) and a relative system identifier
    >>> which one can use to find the DTD relative to the referring LDML file.
    >>> Something like this:
    >>> <!DOCTYPE ldml PUBLIC "-//Unicode Consortium//DTD Common Locale Data
    >>> Repository//EN" "../../dtd/ldml.dtd">
    >>
    >>
    >> 2. There is already a mechanism you can use to use a local, cached
    >> copy of the dtd. It speeds up processing of files quite dramatically.
    >> For the CLDR tools it's implemented in Java but you should be able to
    >> use similar mechanisms for other languages. Look at
    >> CachingEntityResolver in the CLDR repository.
    >
    >
    > I would not recommend changing the system URL in the XML files (this
    > would require editing all of them, meaning that they are no longer
    > portable). Mark is right: the solution is to implement an entity
    > resolver that can, for example, map a list of internally supported
    > public identifiers to a local system URL (which may be embedded in the
    > application itself as an internal static resource). The system URL
    > specified in the XML document is then ignored and should only be used
    > in the absence of a local copy of the entity.

    I will do that. But it still requires editing all the XML files, because
    the files don't actually provide a public identifier, only a system URL.
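
The resolver lookup discussed here can be sketched roughly as follows. This is only an illustration, not the CLDR CachingEntityResolver: the function name `resolve_entity`, the two catalog dictionaries, and the local path `dtd/ldml.dtd` are all assumptions made up for the example. It also assumes the resolver is handed the system identifier along with the public one (as SAX-style resolvers are), so a table keyed on the system URL might cover files that carry no public identifier.

```python
from typing import Optional

# Hypothetical catalogs. The local path is a placeholder, not a real CLDR layout.
PUBLIC_CATALOG = {
    "-//Unicode Consortium//DTD Common Locale Data Repository//EN": "dtd/ldml.dtd",
}
SYSTEM_CATALOG = {
    "http://www.unicode.org/cldr/dtd/1.3/ldml.dtd": "dtd/ldml.dtd",
}


def resolve_entity(public_id: Optional[str], system_id: str) -> str:
    """Return a local copy if one is known, else the document's own system ID."""
    if public_id is not None:
        # Public identifiers are compared after whitespace normalization,
        # so a declaration wrapped across lines still matches.
        key = " ".join(public_id.split())
        if key in PUBLIC_CATALOG:
            return PUBLIC_CATALOG[key]
    # No public identifier (or an unknown one): try the system URL, and
    # fall back to it unchanged when there is no local copy.
    return SYSTEM_CATALOG.get(system_id, system_id)
```

With tables like these, an unedited file that only declares the system URL would still resolve to the local copy, and an unknown DTD would fall through to whatever the document specifies.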

    >
    > Most web browsers do that internally for the HTML DTDs; in addition,
    > they use their own local internet cache to resolve external system URLs
    > if possible without having to query the remote server of that external
    > system URL.
    >
    > Only lazy implementations of validating XML parsers forget to implement
    > entity resolvers, resulting in poor performance and unnecessary multiple
    > requests to get those system entities. The API for implementing entity
    > resolvers is present in all common XML parser libraries...
    >

    I'm currently looking at libwww (W3C), which seems nice for adding network
    support to applications. It supports caching of documents, but I suppose
    that is not the same as caching entities. I'm using expat, which itself
    has no provisions for caching entities. I'll figure it out.
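
Since expat leaves the fetching of external entities to the application, one place a cache could sit is the external entity handler. Below is a rough sketch using Python's binding to expat (`xml.parsers.expat`) rather than the C API. The names `CachingDtdLoader`, `fetch`, and `parse_text` are hypothetical, and `fetch` stands in for whatever actually retrieves the DTD (a file read, or a library such as libwww).

```python
import xml.parsers.expat as expat


class CachingDtdLoader:
    """Fetch each external entity once and keep the bytes in memory."""

    def __init__(self, fetch):
        self._fetch = fetch   # hypothetical callable: system_id -> bytes
        self._cache = {}

    def load(self, system_id):
        if system_id not in self._cache:
            self._cache[system_id] = self._fetch(system_id)
        return self._cache[system_id]


def parse_text(xml_bytes, loader):
    """Parse a document, loading external entities through the loader,
    and return the concatenated character data."""
    parser = expat.ParserCreate()
    # Ask expat to process the external DTD subset; without this the
    # external entity handler is not consulted for the DTD.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    chunks = []

    def on_external_entity(context, base, system_id, public_id):
        # Parse the cached entity with a child parser tied to the parent.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(loader.load(system_id), True)
        return 1  # nonzero tells expat to continue

    parser.ExternalEntityRefHandler = on_external_entity
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)
    return "".join(chunks)
```

Reusing one `CachingDtdLoader` across many LDML files would mean the DTD is retrieved once per run, however many documents reference it. The cache here is keyed on the system identifier exactly as it appears in the document, with no handling of relative URLs or expiry.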

    Theo


