No Invisible Character - NBSP at the start of a word

From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Nov 24 2004 - 06:36:04 CST

    I understand that the proposed INVISIBLE CHARACTER was rejected at the
    recent UTC meeting. I presume that the intention is that NBSP should be
    used instead.

    There are cases of words which start with spacing combining marks, for
    which there are no separate Unicode characters. For example, there are
    some unusual biblical Hebrew word forms (Ketiv consonants with Qere
    vowels, the forms printed in Hebrew Bibles) which start with spacing
    combining marks. For some examples (in fact this is intended to be an
    exhaustive list of such words), see
    http://www.qaya.org/academic/hebrew/Ketiv-Qere-difficult.pdf, the
    "blended forms" column of rows with the note "point before word".

    This UTC decision leaves us in a situation in which such words need to
    be represented in Unicode with NBSP and combining marks at the start of
    a word. Does this lead to problems with HTML, XML, etc.? Are there cases
    in which this word-initial NBSP will be combined with a preceding word
    space, so that the intended word spacing and break opportunity (before
    the NBSP) may be lost?
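
    To make the question concrete, here is a minimal sketch in Python. It
    assumes a processor that collapses only the four characters XML defines
    as white space (U+0020, U+0009, U+000D, U+000A). The helper name
    collapse_xml_whitespace, the choice of HEBREW POINT HIRIQ, and the
    placeholder letters are illustrative assumptions, not taken from the
    list of forms cited above. Under that assumption the NBSP survives
    next to the collapsed space; whether a particular browser or tool
    behaves this way is exactly what the question asks.

```python
import re

SPACE = "\u0020"
NBSP = "\u00a0"
HIRIQ = "\u05b4"           # illustrative combining mark (assumption)
PREV_WORD = "\u05d3\u05d1\u05e8"  # placeholder preceding word (assumption)
WORD_REST = "\u05d0\u05d1"        # placeholder letters after the mark (assumption)

# Only the four XML white space characters. Python's \s is deliberately
# avoided here because, for str patterns, it also matches U+00A0.
XML_WS = re.compile("[\u0020\u0009\u000d\u000a]+")


def collapse_xml_whitespace(text: str) -> str:
    """Replace each run of XML white space with a single U+0020."""
    return XML_WS.sub(SPACE, text)


# Two ordinary spaces, then the word-initial NBSP carrying the mark.
sample = PREV_WORD + SPACE + SPACE + NBSP + HIRIQ + WORD_REST
collapsed = collapse_xml_whitespace(sample)

# Show the code points that remain after collapsing.
print(" ".join(f"U+{ord(ch):04X}" for ch in collapsed))
```

    A collapser written with \s+ instead would swallow the NBSP as well,
    which is one way the intended spacing could be lost in practice.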

    In the Hebrew case, it is probably necessary to precede the NBSP with
    RLM to ensure that the NBSP and combining mark are taken with the rest
    of the word as right-to-left. Does this inserted RLM affect the
    situation with HTML, XML, etc.?
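
    As an illustration of why the RLM may be needed, the sketch below uses
    Python's unicodedata module to list the bidirectional class of each
    character in such a word-initial sequence. The particular mark and
    letter (HEBREW POINT HIRIQ, ALEF) and the helper name describe are
    assumptions chosen for the example. NBSP is not itself a strong
    right-to-left character, so the RLM supplies that strong direction
    ahead of it.

```python
import unicodedata

RLM = "\u200f"
NBSP = "\u00a0"
HIRIQ = "\u05b4"  # illustrative combining mark (assumption)
ALEF = "\u05d0"   # illustrative first letter of the word (assumption)


def describe(text):
    """Return (code point, name, bidi class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]


# RLM, then NBSP as the base for the mark, then the rest of the word.
word_start = RLM + NBSP + HIRIQ + ALEF
rows = describe(word_start)

for code_point, name, bidi_class in rows:
    print(code_point, name, bidi_class)
```

    This only reports character properties; it does not run the
    bidirectional algorithm or show how any HTML or XML processor treats
    the RLM.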

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    


    This archive was generated by hypermail 2.1.5 : Wed Nov 24 2004 - 11:29:49 CST