Re: No Invisible Character - NBSP at the start of a word

From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Nov 24 2004 - 18:53:12 CST


    On 24/11/2004 22:23, Peter Kirk wrote:

    > On 24/11/2004 22:00, Asmus Freytag wrote:
    >
    >> ...
    >> The sequence SPACE NBSP *does* not allow a break after the SPACE
    >> under the line breaking rules we publish in UAX#14.
    >>
    >> The common usage in HTML, is to use one or more NBSP followed by
    >> SPACE to mark a wider space, that allows a break at the end. NBSPs
    >> are not coalesced with other spaces.
    >>
    >>> In the Hebrew case, it is probably necessary to precede the NBSP
    >>> with RLM to ensure that the NBSP and combining mark are taken with
    >>> the rest of the word as right-to-left. Does this inserted RLM affect
    >>> the situation with HTML, XML etc?
    >>
    >>
    >>
    >> You are always free to surround the NBSP with other format
    >> characters, such as RLM or ZWSP, to tailor whatever behavior those
    >> format characters affect.
    >>
    > Thank you.
    >
    > What if I used the sequence <RLM, NBSP, combining mark>? Would I then
    > get a break opportunity before the RLM, if it is preceded by SPACE?
    > Presumably LRM could be used similarly if the same situation occurs in
    > a left-to-right language.
    >
    I note that there is a relevant change being proposed to UAX #29 (public
    review issue #51), in that NBSP is now to be treated as a letter for
    determination of word and sentence boundaries. This certainly helps with
    the use of NBSP as a carrier for spacing diacritics, as e.g. in Hebrew.
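
    A minimal sketch of the sequence <RLM, NBSP, combining mark> discussed
    above, assuming Python's standard unicodedata module and taking HEBREW
    POINT HIRIQ as an arbitrary example mark (the thread does not name a
    specific one). It only lists character properties; it does not
    implement any UAX rule.

```python
import unicodedata

RLM = "\u200F"    # RIGHT-TO-LEFT MARK
NBSP = "\u00A0"   # NO-BREAK SPACE, the carrier for the isolated mark
HIRIQ = "\u05B4"  # assumption: arbitrary example combining mark

sequence = RLM + NBSP + HIRIQ


def describe(text):
    """Return (code point, name, general category, bidi class) per character."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.bidirectional(ch),
        )
        for ch in text
    ]


for row in describe(sequence):
    print(*row)
```

    NBSP is reported with bidi class CS rather than a strong
    right-to-left class, which is consistent with the suggestion to
    precede it with RLM in Hebrew text.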

    Also, the following clarification is being proposed for UAX #14 on line
    breaking (public review issue #56):

    > The preferred base character for showing combining marks in isolation
    > is U+00A0 NO-BREAK SPACE. If a line break before or after the
    > combining sequence is desired, U+200B ZERO WIDTH SPACE can be used.
    > The use of U+0020 SPACE as a base character is deprecated.

    But this draft also states:

    > when NBSP follows SPACE, there is a break opportunity after the SPACE
    > and NBSP will go as visible space onto the next line.

    This is different from what Asmus stated above: "The sequence SPACE NBSP
    *does* not allow a break after the SPACE". So is this actually a
    proposed change to the line breaking rules? If so, it is one I support.
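
    The difference between the two readings can be sketched with a toy
    model. This is not the UAX #14 algorithm: the function name and the
    break_before_nbsp flag are hypothetical, and only the break
    opportunity after SPACE is modelled.

```python
SPACE = "\u0020"
NBSP = "\u00A0"


def breaks_after_space(text, break_before_nbsp):
    """Return each index i where a break is allowed between text[i-1] and text[i].

    Toy model: only breaks following a SPACE are considered.
    break_before_nbsp=False follows the reading "SPACE NBSP does not allow
    a break after the SPACE"; True follows the draft wording quoted above.
    """
    opportunities = []
    for i in range(1, len(text)):
        if text[i - 1] != SPACE:
            continue
        if text[i] == SPACE:
            continue  # keep a run of spaces together; break after the last one
        if text[i] == NBSP and not break_before_nbsp:
            continue  # NBSP suppresses the break after the preceding SPACE
        opportunities.append(i)
    return opportunities


# "ab", SPACE, NBSP carrying an example combining mark, SPACE, "cd"
sample = "ab" + SPACE + NBSP + "\u05B4" + SPACE + "cd"

print(breaks_after_space(sample, break_before_nbsp=False))
print(breaks_after_space(sample, break_before_nbsp=True))
```

    Under the first reading the NBSP and its mark stay on the same line as
    the preceding word; under the draft wording they may move to the next
    line.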

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

