Re: Questions on ZWNBS - for line initial holam plus alef

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Aug 13 2003 - 14:19:48 EDT


    ----- Original Message -----
    From: "John Cowan" <cowan@mercury.ccil.org>
    To: "Peter Kirk" <peter.r.kirk@ntlworld.com>
    Cc: <unicode@unicode.org>
    Sent: Wednesday, August 13, 2003 5:31 AM
    Subject: Re: Questions on ZWNBS - for line initial holam plus alef

    > Peter Kirk scripsit:
    >
    > > Philippe or anyone else, would it be "XML-safe" to use NBSP rather
    > > than SP as the base character for spacing diacritics in XML? Perhaps
    > > that's the answer here. I know there are still some issues of detail
    > > concerning the line breaking, but apart from that is there any other
    > > problem?

    For XML, using NBSP would be safe. There is one caveat, however: it
    introduces a non-breaking property, which may be an issue for rendering
    but normally not for text processing. This can be corrected by specifying
    that NBSP+combining does not have the non-breaking property, and that a
    "don't break here" format control can be used if needed to specify the
    breaking behavior.

    In that case, this change in the properties of the combining sequence
    (which in fact have not been specified until now) would be harmless,
    as the behavior was never clearly specified and is implementation
    dependent. We could then say that SPACE+diacritics is deprecated
    in favor of NBSP+diacritics (which would NOT inherit the non-breaking
    behavior but would have its own properties).
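
    To illustrate why NBSP is the safer base in XML character content,
    here is a minimal Python sketch. It assumes a downstream tool that
    trims XML whitespace from element text; trim_xml_whitespace and
    roundtrip are hypothetical helpers written for this example, not
    part of any library. With SPACE as the base, trimming leaves the
    holam without a base character; with NBSP the sequence survives.

```python
import xml.etree.ElementTree as ET

HOLAM = "\u05b9"  # HEBREW POINT HOLAM (a combining mark)
ALEF = "\u05d0"   # HEBREW LETTER ALEF
NBSP = "\u00a0"   # NO-BREAK SPACE
XML_WS = " \t\r\n"  # the characters XML 1.0 treats as whitespace


def trim_xml_whitespace(text):
    """Hypothetical stand-in for a tool that strips XML whitespace."""
    return text.strip(XML_WS)


def roundtrip(content):
    """Parse <w>content</w> and return the element's text."""
    return ET.fromstring("<w>" + content + "</w>").text


with_space = roundtrip(" " + HOLAM + ALEF)
with_nbsp = roundtrip(NBSP + HOLAM + ALEF)

# SPACE base: trimming removes the base, leaving the mark orphaned.
orphaned = trim_xml_whitespace(with_space)
# NBSP base: not XML whitespace, so the sequence is left untouched.
preserved = trim_xml_whitespace(with_nbsp)
```

    This only covers text processing; it says nothing about how a
    renderer will break lines around NBSP+diacritic.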

    > NBSP is not usable in attribute values other than those of type CDATA,
    > but it is usable in character content. XML does not consider it
    > whitespace (the only whitespace characters are LF, SPACE, TAB and
    > marginally CR).

    And NEL, which XML 1.1 treats as a line end (for compatibility with
    EBCDIC systems).
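
    The whitespace point can be sketched in Python as follows. The
    set below is the XML 1.0 whitespace set (SPACE, TAB, CR, LF); the
    second function approximates the end-of-line handling of XML 1.1,
    which translates NEL (U+0085) and U+2028 to LF on input. The
    names is_xml_whitespace and normalize_eol_xml11 are made up for
    this illustration, and the sketch is not a conforming parser.

```python
import re

# XML 1.0 whitespace: SPACE, TAB, CR, LF. NBSP and NEL are not included.
XML_S = frozenset("\x20\x09\x0d\x0a")


def is_xml_whitespace(ch):
    """Return True if ch is one of the four XML 1.0 whitespace characters."""
    return ch in XML_S


# Two-character sequences come first so they collapse to a single LF.
_XML11_EOL = re.compile("\r\n|\r\x85|\r|\x85|\u2028")


def normalize_eol_xml11(text):
    """Approximate XML 1.1 end-of-line handling: map line ends to LF."""
    return _XML11_EOL.sub("\n", text)
```

    Under this reading, NEL is accepted as a line end in XML 1.1 but
    NBSP is never whitespace in either version.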


