Re: No Invisible Character - NBSP at the start of a word

From: Peter Kirk
Date: Fri Nov 26 2004 - 06:12:20 CST

    On 26/11/2004 03:40, Mark E. Shoulson wrote:

    > ...
    > I think part of what makes Biblical Hebrew so contentious is the
    > unstated assumption that "the BHS text of the Bible *must* be
    > considered plain-text." It's not necessarily so. It isn't
    > necessarily a bad rule to work with, but it isn't one we should take
    > for granted, and it's one we do need to examine and consider.

    I understand that this is not self-evident. But let's look at the
    arguments. The word forms which by my contention should be supported as
    plain text are the ones actually found, not just in a single Bible
    edition, but in Hebrew Bible manuscripts from the 10th century CE and in
    all printed editions, except perhaps for some simplified ones, until
    today. (Some of the special features which have already been accepted by
    the UTC, such as right METEG, are found in only some such manuscripts
    and editions, but this is not true of the Qere/Ketiv blended forms.) And
    the distinctions made have real semantic significance; they are not
    simply layout preferences. As I understand it, Unicode intends to be
    able to represent the semantically significant features of texts in
    general use. This is clearly a text in general use, and its special
    formatting features are semantically significant. Therefore they
    should be represented in Unicode.

    It is true that these special formatting features have a complex
    relationship to the actual phonetic realisation of the text, and can be
    fully understood only in conjunction with the marginal notes. But
    Unicode has never been intended to represent the phonetic realisation of
    a text, and it has certainly not been restricted to characters which are
    part of that phonetic realisation. The criterion for a Unicode character
    is not that it has a distinct sound, but that it has a distinct, and
    semantically significant, written form. These Qere/Ketiv blended forms
    are the actual written forms in the text, and as such, irrespective of
    how they might be pronounced or not pronounced, they are the ones which
    Unicode needs to represent.

    Peter Kirk
