Re: ISO 10646 compliance and EU law

From: Kenneth Whistler (
Date: Tue Jan 04 2005 - 13:22:55 CST


    This thread has wandered far afield into banknote design
    issues. :-)

    But getting back on topic, Antoine Leca commented:

    > Also, conformance to 10646 (very different from compliance to Unicode)
    > requires the mention of the UTF, the implementation level and the
    > collections used. So it would mean that ANY software (that deals with
    > strings, and to which the "certain circumstances" might apply) selled here
    > should make these parameters clear.
    > Well, looks like about nobody is complying the law :-).
    > Or else, that the "certain circumstances" are rather narrow.

    I wouldn't draw quite the same conclusions.

    10646 states:

    "A claim of conformance shall identify the adopted form, the
    adopted implementation level and the adopted subset by means of
    a list of collections and/or characters."

    Note the crucial point -- this is about a *claim* of conformance.

    And this is the reason why the Unicode Standard includes,
    in Appendix C.5, just such a claim, indicating (p. 1352) that
    the Unicode Standard has the following features:

      * Numbered subset 305 (UNICODE 4.0)
      * UTF-8, UTF-16, or UCS-4 (= UTF-32)
      * Implementation level 3

    Unicode implementations can then implicitly derive their claims
    of conformance to 10646 from Unicode's own claim of conformance,
    as long as they then conform to the Unicode Standard itself.
    This does not, of course, prevent an implementation from making
    a more restricted claim of conformance, as in supporting only
    a single Unicode Encoding Form or a much more restricted subset
    of characters.
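
    As a purely illustrative sketch, the three parts of such a claim
    could be recorded as a small data structure. This is Python, and
    the names ConformanceClaim, UNICODE_4_0 and UTF8_ONLY are
    hypothetical; neither standard defines anything like them. Only
    the field values come from the claim quoted above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ConformanceClaim:
    """Hypothetical record of the three items a 10646 claim must identify."""
    forms: tuple   # adopted form(s)
    level: int     # adopted implementation level
    subset: str    # adopted subset (collections and/or characters)

    def statement(self) -> str:
        # Render the claim as one line of documentation text.
        forms = ", ".join(self.forms)
        return (f"form: {forms}; implementation level: {self.level}; "
                f"subset: {self.subset}")


# Values taken from the Unicode Standard's own claim, as quoted above.
UNICODE_4_0 = ConformanceClaim(
    forms=("UTF-8", "UTF-16", "UCS-4"),
    level=3,
    subset="numbered subset 305 (UNICODE 4.0)",
)

# A more restricted claim: a single encoding form, everything else unchanged.
UTF8_ONLY = replace(UNICODE_4_0, forms=("UTF-8",))

print(UTF8_ONLY.statement())
```

    The restricted claim differs from the full one only in the forms
    it lists, which is the kind of narrowing described above.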

    However, *being* in conformance to 10646 doesn't necessarily involve
    making an explicit claim of conformance. A process simply
    has to ensure that its usage of "CC-data-elements" follows
    certain rules:

      "all the coded representations of graphic characters within
       that CC-data-element conform to clauses 6 and 7, ..."
      [Unicode translation: the characters are interpreted according
        to the specification in the standard]
      "... to an identified form chosen from clause 13 or annex C
        or annex D, ..."
      [Unicode translation: the data is represented in a well-formed
        code unit sequence in a specified Unicode Encoding Form]
      "... and to an identified implementation level chosen from
        clause 14"
      [Unicode translation: Unicode implementations are "Level 3" in
        principle, even if they don't handle combining marks as
        part of their subset.]
      "all the graphic characters represented within that
        CC-data-element are taken from those with an identified
        subset (clause 12)"
      [Unicode translation: the data contains only assigned code
        points in whatever version of the standard is supported]
      "all the coded representations of control functions within
        the CC-data-element conform to clause 15"
      [Unicode translation: any control characters follow the normal
        rules of representation in Unicode Encoding Forms.]

    Now a string-handling process can do all of that without having
    some explicit claim of conformance built in. It isn't as if a
    string API needs to first post a pop-up window message indicating
    its claim of conformance parameters before it can process the
    string. A claim of conformance is a *meta*-statement that is made
    *about* software, and can be done in documentation or by implicit
    reference to some other claim of conformance.
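
    To make that concrete, here is a minimal sketch in Python of a
    process that follows two of the rules listed above with no claim
    of conformance built in. It rests on assumptions: the function
    name check_cc_data_element is made up, the identified form is
    taken to be UTF-8, and "assigned" is approximated by the Unicode
    version bundled with Python's unicodedata module (general
    category Cn covers unassigned code points and noncharacters).
    It does not try to model implementation levels or control
    functions.

```python
import unicodedata


def check_cc_data_element(data: bytes, encoding: str = "utf-8"):
    """Return (ok, reason) for a byte string in the given encoding form.

    Hypothetical helper: checks well-formedness and that every code
    point is assigned in the Unicode version Python ships with.
    """
    # Rule: the data is a well-formed code unit sequence in the
    # identified encoding form. Strict decoding rejects ill-formed input.
    try:
        text = data.decode(encoding, errors="strict")
    except UnicodeDecodeError as exc:
        return False, f"ill-formed {encoding} sequence at byte {exc.start}"

    # Rule: every character comes from the identified subset. Here the
    # subset is assumed to be "everything assigned in this Unicode version".
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == "Cn":
            return False, f"U+{ord(ch):04X} at index {index} is not assigned"

    return True, "ok"


print(check_cc_data_element("Pāli".encode("utf-8")))
print(check_cc_data_element(b"\xc3\x28"))
```

    Nothing in the function announces a claim of conformance; any
    such claim would live in the documentation of whatever software
    calls it.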

    Since all significant implementations of ISO/IEC 10646 are in
    fact implementations of the Unicode Standard, and since the
    Unicode Standard makes the claim of conformance to 10646 for them,
    software claiming conformance to the Unicode Standard is making
    its claim of conformance to 10646 implicitly. As long as it
    follows the basic rules of conformance -- which are really pretty
    simple and easy to deal with -- then it should be covered for any
    generic regulations or laws dealing with conformance to 10646.

    Of course, there may be further requirements on software, such
    as explicit specifications that certain characters be displayed,
    accepted for input, and otherwise processed. Such requirements
    are really the more important ones for actual applications,
    and they go beyond the basic issues of conformance to the standards.

