Re: Fw: Biblical Hebrew: possible solution for XML

From: John Cowan
Date: Fri Jun 27 2003 - 13:17:06 EDT

    Philippe Verdy scripsit:

    > Given that XML will require normalization for texts identified as
    > being Unicode encoded (UTF-8 and others), couldn't a document be
    > labelled so that the normalization step is removed from XML
    > processing, using an "ISO-10646-8" encoding name (for the UTF-8
    > encoding scheme)?

    No. The W3C rule is "Check normalization on input (parsing), create
    normalization on output (creating or transcoding)", and it applies to
    all encodings, since any character may be expressed in any encoding
    using character references.

    However, normalization checking is still a SHOULD even in XML 1.1, and at
    most a MAY (not actually mentioned at all) in XML 1.0, the current version.
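
    To illustrate why the encoding label cannot matter, here is a minimal
    Python sketch. It assumes only the standard unicodedata and re modules;
    the helper names expand_char_refs and is_nfc are hypothetical, and the
    reference expansion is a simplification of what a real XML parser does.
    The document below consists purely of ASCII bytes, yet after character
    references are expanded its content is not in NFC, because the dagesh
    (combining class 21) precedes the patah (combining class 17).

```python
import re
import unicodedata


def expand_char_refs(text):
    """Expand numeric character references (&#N; and &#xH;) only.

    Simplified stand-in for a parser: it ignores named entities,
    CDATA sections, and well-formedness checks.
    """
    def repl(match):
        body = match.group(1)
        code = int(body[1:], 16) if body.startswith("x") else int(body)
        return chr(code)

    return re.sub(r"&#([0-9]+|x[0-9A-Fa-f]+);", repl, text)


def is_nfc(text):
    """Return True if text is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


# BET, then DAGESH (U+05BC), then PATAH (U+05B7), written as references.
raw = b"<w>&#x5D1;&#x5BC;&#x5B7;</w>"

decoded = raw.decode("ascii")          # any ASCII-compatible label would do
expanded = expand_char_refs(decoded)   # what the application actually sees
normalized = unicodedata.normalize("NFC", expanded)

print(is_nfc(decoded))    # the raw markup is trivially normalized
print(is_nfc(expanded))   # the expanded content is not
print([hex(ord(c)) for c in normalized if ord(c) > 0x7F])
```

    A normalization check therefore has to look at the text after
    references are expanded, whatever the encoding declaration says.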

    John Cowan
