RE: No Invisible Character - NBSP at the start of a word

From: Jony Rosenne
Date: Fri Nov 26 2004 - 13:05:58 CST

    Normal printed text is hardly ever plain text. It contains headings,
    highlighted phrases, paragraphs etc.

    The Hebrew Bible has its unique non-plain text artifacts, such as
    Ketiv/Qere. If standardization is necessary, take it to the SGML people.

    Simple cases of Ketiv/Qere can be managed without markup, for example when
    the vowels of the Qere happen to fit the Ketiv, but this is not a general
    solution, nor does it imply that Ketiv/Qere is not a markup matter.


    > -----Original Message-----
    > On Behalf Of Peter Kirk
    > Sent: Friday, November 26, 2004 2:12 PM
    > To: Mark E. Shoulson
    > Cc: Dean Snyder; Unicode List
    > Subject: Re: No Invisible Character - NBSP at the start of a word
    > On 26/11/2004 03:40, Mark E. Shoulson wrote:
    > > ...
    > >
    > > I think part of what makes Biblical Hebrew so contentious is the
    > > unstated assumption that "the BHS text of the Bible *must* be
    > > considered plain-text." It's not necessarily so. It isn't
    > > necessarily a bad rule to work with, but it isn't one we should
    > > take for granted, and it's one we do need to examine and consider.
    >
    > I understand that this is not self-evident. But let's look at the
    > arguments. The word forms which by my contention should be supported
    > as plain text are the ones actually found, not just in a single Bible
    > edition, but in Hebrew Bible manuscripts from the 10th century CE and
    > in all printed editions, except perhaps for some simplified ones,
    > until today. (Some of the special features which have already been
    > accepted by the UTC, such as right METEG, are found in only some such
    > manuscripts and editions, but this is not true of the Qere/Ketiv
    > blended forms.) And the distinctions made have real semantic
    > significance, they are not simply layout preferences. As I understand
    > it, Unicode intends to be able to represent the semantically
    > significant features of texts in general use. This is clearly a text
    > in general use, and the special formatting features of it are
    > semantically significant. Therefore they should be represented in
    > Unicode.
    >
    > It is true that these special formatting features have a complex
    > relationship to the actual phonetic realisation of the text, and can
    > be fully understood only in conjunction with the marginal notes. But
    > Unicode has never been intended to represent the phonetic realisation
    > of a text, and it has certainly not been restricted to characters
    > which are part of that phonetic realisation. The criterion for a
    > Unicode character is not that it has a distinct sound, but that it has
    > a distinct, and semantically significant, written form. These
    > Qere/Ketiv blended forms are the actual written forms in the text, and
    > as such, irrespective of how they might be pronounced or not
    > pronounced, they are the ones which Unicode needs to represent.
    > --
    > Peter Kirk
