Re: No Invisible Character - NBSP at the start of a word

From: Asmus Freytag (
Date: Sat Nov 27 2004 - 22:14:44 CST


    At 04:58 PM 11/27/2004, John Hudson wrote:
    >Mark E. Shoulson wrote:
    >>Well, that's the difference under discussion. The "plain text" would
    >>seem to be either the qere or the ketiv (but not the combined "blended"
    >>form), since each of those is somewhat sensible.
    >Is there some place in the standard where it says text must be sensible?

    No. The intent is to allow the author to unambiguously present his text. If
    that includes deliberate mis-spellings or other funny business, that's
    between the author and the reader.

    In scripts with complex layout, of course, not all random character soup
    would be rendered the same by all systems. Which, I think, is the point
    here. If this is a rather commonly used device, then in principle it's
    possible to ask why it cannot be part of plain text.

    If the necessary mechanisms to do this are cheap and simple, the answer is
    often to bring such things under the plain text umbrella. If it's
    complicated, the answer should be to leave it to mechanisms such as markup
    that deal well with whatever kind of complexity is required.

    This segues nicely to an answer to a different issue raised earlier in this
    thread.

    Interlinear annotation characters were added to Unicode before we
    discovered a more general mechanism. They were never mainly intended for
    interchange, but for internal representation, where the special character
    codes serve as anchors or placeholders in the text stream while the
    formatting information is kept in a side buffer. Nowadays, we have 66
    generic noncharacters that are the correct tool for such process-internal use.
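
    To make the anchor-plus-side-buffer idea concrete, here is a minimal
    Python sketch. It enumerates the 66 noncharacters (U+FDD0..U+FDEF plus
    the last two code points of each of the 17 planes) and uses one of them
    as an in-stream anchor. The names ANCHOR, add_annotation and
    strip_for_interchange, and the choice of U+FDD0, are assumptions made
    for illustration only; nothing here is prescribed by the standard.

```python
def noncharacters():
    """Return the 66 Unicode noncharacter code points."""
    cps = list(range(0xFDD0, 0xFDF0))           # 32 in the BMP block
    for plane in range(17):                     # planes 0 through 16
        base = plane << 16
        cps.extend((base | 0xFFFE, base | 0xFFFF))
    return cps

NONCHARS = frozenset(noncharacters())

# Hypothetical choice: any noncharacter would serve as the anchor.
ANCHOR = "\uFDD0"

def add_annotation(text, pos, note, side_buffer):
    """Insert an anchor at pos and store the note in the side buffer.

    The n-th anchor in the text corresponds to side_buffer[n], so the
    note is inserted at the index matching the anchor's ordinal.
    """
    index = text[:pos].count(ANCHOR)
    side_buffer.insert(index, note)
    return text[:pos] + ANCHOR + text[pos:]

def strip_for_interchange(text):
    """Drop all noncharacters before the text leaves the process."""
    return "".join(ch for ch in text if ord(ch) not in NONCHARS)
```

    The point of the design is that the anchors only have meaning inside
    the process that owns the side buffer, which is why they are stripped
    before interchange rather than sent along with the text.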


    This archive was generated by hypermail 2.1.5 : Sat Nov 27 2004 - 22:16:34 CST