From: Mark E. Shoulson (email@example.com)
Date: Mon May 24 2004 - 21:55:28 CDT
John Hudson wrote:
> Peter Constable wrote:
>> I was not involved in those discussions so cannot comment on them. I
>> just wish to point out that the MCW representation of Hebrew most
>> certainly *is* supported in Unicode: MCW uses ASCII Latin letters and
>> punctuation characters to stand for Hebrew letters, vowel points and
>> accents, and those exact same ASCII characters are encoded in Unicode.
> This was an 8-bit hack. The point that Elaine and other Biblical
> Hebrew scholars make is that MCW explicitly encodes distinctions
> between some marks, based on positioning, that the Unicode Hebrew
> block unifies. This means that while MCW text can be easily converted
> to Unicode Hebrew, it is not possible to round-trip such conversion in
> the same way that Unicode provides for pre-existing 8-bit standard
> character sets.
Hmm... Is that even true anymore? I think the only ones remaining are
things like the auxiliary telishas and pashtas, which can and should
properly be handled by the font. If the text is *valid*, there can
never be an auxiliary pashta on a non-final letter, nor a
"non-auxiliary" pashta on a final letter, thus one can unambiguously
reconstruct which was what. Zarqa and Tsinnor are misnamed, but
workable. The metegs are still not done quite satisfactorily, though.
And the canonical classes are an unfixable mess, and there are a few
other small things too, I guess...
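To make the pashta argument concrete, here is a rough sketch in Python of how the positional distinction could be recovered from unified Unicode text. It assumes the input is a single *valid* word in logical order, with each accent stored after the letter it sits on; the function name and the "final"/"non-final" labels are made up for illustration and are not part of MCW or of any Unicode specification. Which label corresponds to which MCW code is left to whoever owns the conversion table.

```python
PASHTA = "\u0599"  # U+0599 HEBREW ACCENT PASHTA, the single unified code point


def is_hebrew_letter(ch):
    """True for the Hebrew consonants U+05D0..U+05EA (alef through tav)."""
    return "\u05D0" <= ch <= "\u05EA"


def classify_pashtas(word):
    """Label each pashta in one word by the letter it is attached to.

    Hypothetical helper: returns a list of (index, label) pairs, where
    label is "final" if the pashta follows the last letter of the word
    and "non-final" otherwise.
    """
    letter_positions = [i for i, ch in enumerate(word) if is_hebrew_letter(ch)]
    if not letter_positions:
        return []
    last_letter = letter_positions[-1]

    labels = []
    for i, ch in enumerate(word):
        if ch == PASHTA:
            # Marks follow their base letter, so any pashta stored after
            # the last letter must belong to that last letter.
            labels.append((i, "final" if i > last_letter else "non-final"))
    return labels


# Artificial test word: mem + pashta, lamed, final kaf + pashta.
sample = "\u05DE\u0599\u05DC\u05DA\u0599"
print(classify_pashtas(sample))
```

This only works because of the validity assumption: if a text put the wrong kind of pashta on a letter, the position alone could not tell you so, and that information would be lost in the conversion.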
> One of the unfortunate aspects of this is that the ASCII-hack MCW
> encoding will likely remain the source encoding for many electronic
> Biblical Hebrew texts for some time to come, even if published texts
> are re-encoded as Unicode Hebrew, since MCW permits simple and
> unambiguous plain-text encoding of distinctions that are important to
> textual analysis. For example, although my clients at Libronic use
> Unicode encoding for their electronic BHS edition (because it provides
> greater interchangeability), they maintain an MCW encoded text as
> their master source. So much for the 'universal' character set...
Hey, the "Universal" set isn't always the Right Tool for specialized
jobs. It's nice when it is, but MCW has *worked* for them for a long
time, and since they control it, it *must* have precisely the features
they need. Why not stick with what works?