From: Peter Kirk (email@example.com)
Date: Fri Nov 26 2004 - 16:37:12 CST
On 26/11/2004 21:27, Doug Ewell wrote:
>One useful litmus (or lackmus) test for this Hebrew example would be
>whether the text in question is still legible, with its original
>meaning, when reduced to plain text representable in today's Unicode.
>If the special Ketiv/Qere handling is needed only because It Is The
>Word, and This Is How It Was Written, then this is probably a
>paleographic distinction and out of scope for plain text. If it
>genuinely changes the spelling, that is another matter.
Well, for a start we need to define what might be meant by "reduced to
plain text". In this case there is simply no logical way to describe
what is written as plain text plus markup. I suppose some kind of markup
like <ketiv>KKKK</ketiv><qere>QqQqQ</qere> could be used (K = Ketiv base
character, Q = Qere base character, q = Qere diacritical mark), and this
would preserve the original meaning, but it would not show how the
individual Ketiv base characters and Qere combining marks are
graphically combined, i.e. it would not distinguish the written
"blended" forms KqKqKK and KqKKqK, which are graphically distinct. And
certainly if the markup were simply stripped from this the resulting
form KKKKQqQqQ would not be legible.
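To make that concrete, here is a minimal sketch in Python using the same placeholder letters. The function names `strip_markup` and `layers` are hypothetical, invented only for this illustration; they are not part of any proposal. The sketch shows that stripping the tags simply runs the two readings together, and that the two blended forms carry the same letters and marks in the same order, so the two-layer markup cannot tell them apart.

```python
import re

# Placeholder notation from the message:
# K = Ketiv base character, Q = Qere base character, q = Qere diacritical mark.
MARKED_UP = "<ketiv>KKKK</ketiv><qere>QqQqQ</qere>"


def strip_markup(text):
    """Remove every <...> tag and keep only the character data."""
    return re.sub(r"<[^>]+>", "", text)


def layers(blended):
    """Split a blended form into its Ketiv bases and its Qere marks, in order."""
    bases = "".join(c for c in blended if c == "K")
    marks = "".join(c for c in blended if c == "q")
    return bases, marks


# Stripping the tags concatenates the two readings.
print(strip_markup(MARKED_UP))

# Both blended forms reduce to the same pair of layers, although the
# strings themselves (and the written forms) differ.
print(layers("KqKqKK") == layers("KqKKqK"))
print("KqKqKK" == "KqKKqK")
```

What distinguishes the two blended forms is which base each mark sits on, and that information exists only in the interleaved sequence.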
Fortunately, this whole issue is a storm in a teacup, because Unicode
does provide quite adequate ways of representing every known Ketiv and
Qere blended form, and has done since we sorted out the Yerushala(y)im
issue more than a year ago. The only real problem comes when the Qere is longer
than the Ketiv and the blended form looks something like qKqKqKq, so
starting with a combining mark. It is well established that such a
combining mark with a blank base character may be represented by NBSP
followed by the combining mark (and the alternative with SPACE is now
apparently deprecated). And it seems that the UTC, in rejecting the
INVISIBLE LETTER proposal and in proposing instead certain changes to
the properties of NBSP (which are currently out for public review), has
reaffirmed this usage.
So I raised this issue only to clarify exactly how NBSP should be used
in such cases. Although I have been rather confused by the responses I
have received, I think the situation is clear as follows: NBSP may be
used with a combining mark at the start of a word, but should be
preceded by ZWSP to ensure a break opportunity before the word (although
this should become unnecessary if the proposed revision to UTR #14 is
accepted) and also by RLM to ensure correct bidi behaviour.
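As a rough sketch of that conclusion, the following Python builds such a word start. The helper name `word_with_leading_mark` is hypothetical, the HIRIQ and ALEF are arbitrary sample characters rather than a real Ketiv/Qere form, and the relative order of ZWSP and RLM is my assumption, since the description above fixes only that both come before NBSP.

```python
import unicodedata

ZWSP = "\u200B"  # ZERO WIDTH SPACE: break opportunity before the word
RLM = "\u200F"   # RIGHT-TO-LEFT MARK: keeps the bidi behaviour right-to-left
NBSP = "\u00A0"  # NO-BREAK SPACE: blank base for the leading combining mark


def word_with_leading_mark(mark, rest):
    """Return ZWSP + RLM + NBSP + mark + rest (ZWSP/RLM order is assumed)."""
    if unicodedata.combining(mark) == 0:
        raise ValueError("expected a combining mark")
    return ZWSP + RLM + NBSP + mark + rest


HIRIQ = "\u05B4"  # sample Qere point, chosen only for illustration
ALEF = "\u05D0"   # sample Ketiv letter, chosen only for illustration

word = word_with_leading_mark(HIRIQ, ALEF)
for ch in word:
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
```

If the proposed revision to UTR #14 is accepted, the ZWSP could presumably be dropped from the sequence.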
Please let me know if any of you disagree with this conclusion.
-- Peter Kirk firstname.lastname@example.org (personal) email@example.com (work) http://www.qaya.org/