Re: No Invisible Character - NBSP at the start of a word

From: Peter Kirk
Date: Mon Dec 06 2004 - 12:57:09 CST

    On 06/12/2004 17:41, Peter Constable wrote:

    > ...
    >At this point, I would ask that people move from voicing critiques and
    >stating inadequacy to making concrete proposals that identify precisely
    >what is inadequate and precisely how that can be remedied.

    I tried to do this about a week ago on the Hebrew list, which is the
    proper place for discussion of Hebrew-specific issues, as opposed to the
    general issue with which this thread started. But people seem reluctant
    to discuss anything on the Hebrew list at the moment, for fear of
    starting a flame war. (I can assure you all that this cannot happen,
    because that list is being closely moderated.)

    It is my understanding that there is in fact no inadequacy with Unicode
    as currently specified. The only potential problem is a reluctance to
    recognise that NBSP + vowel point and/or base character + two vowel
    points are sometimes necessary for representation of unusual but valid
    Hebrew word forms. This of course implies that rendering engines and
    fonts should support such combinations, and should not flag them as
    illegal, for example by showing dotted circles.
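
To make the two sequence types concrete, here is a minimal Python sketch. The particular characters (HIRIQ, PATAH, LAMED) and the helper name `describe` are illustrative assumptions, not taken from any specific Qere/Ketiv form discussed in this thread.

```python
import unicodedata

NBSP = "\u00A0"   # NO-BREAK SPACE, carrier for a vowel point with no consonant
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ
PATAH = "\u05B7"  # HEBREW POINT PATAH
LAMED = "\u05DC"  # HEBREW LETTER LAMED

# Case 1: a vowel point with no consonant before it, carried by NBSP.
isolated_vowel = NBSP + HIRIQ

# Case 2: one base character carrying two vowel points.
double_vowel = LAMED + PATAH + HIRIQ


def describe(text):
    """Return (code point, character name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


for label, seq in (("NBSP + vowel", isolated_vowel), ("base + two vowels", double_vowel)):
    print(label)
    for row in describe(seq):
        print("  ", row)
```

Both strings are ordinary, well-formed Unicode text. Whether they display without dotted circles depends on the rendering engine and font, which this sketch cannot check.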

    I have asked several times if anyone has examples of Qere/Ketiv blended
    forms which cannot be represented in Unicode as currently specified, and
    no one has come up with anything. So I tentatively conclude that there
    is no actual problem with Unicode in this respect. On this point I can
    agree in part with Peter C: Unicode can represent any combination of
    consonants and vowels in any order, using the CGJ mechanism to prevent
    unwanted reordering, except that an initial NBSP is required when a
    word starts with a vowel point.
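
The reordering in question is the canonical reordering done by normalization. The following Python sketch shows it, again assuming LAMED + PATAH + HIRIQ purely as an illustrative sequence; the variable and function names are hypothetical.

```python
import unicodedata

CGJ = "\u034F"  # COMBINING GRAPHEME JOINER, canonical combining class 0
LAMED, PATAH, HIRIQ = "\u05DC", "\u05B7", "\u05B4"

# PATAH has combining class 17 and HIRIQ has class 14, so normalization
# sorts HIRIQ ahead of PATAH when the two marks are adjacent.
plain = LAMED + PATAH + HIRIQ

# CGJ has class 0, so the marks on either side of it are never
# reordered across it.
guarded = LAMED + PATAH + CGJ + HIRIQ


def normalized(text, form="NFC"):
    """Apply a Unicode normalization form to text."""
    return unicodedata.normalize(form, text)


reordered = normalized(plain)
preserved = normalized(guarded)

print([f"U+{ord(c):04X}" for c in reordered])
print([f"U+{ord(c):04X}" for c in preserved])
```

Without the CGJ, the intended PATAH-before-HIRIQ order is lost once the text is normalized; with it, the sequence passes through unchanged.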

    Peter Kirk
