RE: ZWJ/ZWNJ in combining mark sequences

From: Language Analysis Systems, Inc. Unicode list reader (Unicode-mail@las-inc.com)
Date: Wed Nov 05 2003 - 14:11:54 EST


    >Here are my suggestions for Really Stoopid but unquestionably
    >conformant, needs-no-new-characters-can-be-made-to-work-today,
    >alternatives:
    >
    >Medial meteg: < CGJ, hataf patah, CGJ, meteg >
    >or
    >Medial meteg: <hataf patah, CGJ, CGJ, meteg >

    I should probably know better than to jump into this discussion,
    especially since I really don't know anything about Biblical Hebrew, but
    I could have sworn that this had been discussed here before. I thought
    what I remembered people saying was that the medial meteg was the
    high-runner case and the versions with the meteg on either side were the
    exceptions. If that were true, you'd get

    Medial meteg: < hataf patah, meteg >
    Left meteg: < hataf patah, CGJ, meteg >
    Right meteg: < meteg, CGJ, hataf patah >

    Simple enough, if it makes any sense linguistically.
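
    For anyone who wants to check how these three sequences behave under
    normalization, here is a rough sketch using Python's standard
    unicodedata module. The base letter (alef) and the variable names are
    just my own choices for illustration; the code points are U+05B2
    (hataf patah, combining class 12), U+05BD (meteg, combining class 22),
    and U+034F (CGJ, combining class 0). It only shows that the CGJ keeps
    the two marks from being reordered; it says nothing about how a font
    or renderer would actually position the meteg.

```python
import unicodedata

ALEF = "\u05D0"         # base letter, chosen arbitrarily for the example
HATAF_PATAH = "\u05B2"  # HEBREW POINT HATAF PATAH, combining class 12
METEG = "\u05BD"        # HEBREW POINT METEG, combining class 22
CGJ = "\u034F"          # COMBINING GRAPHEME JOINER, combining class 0

# The three encodings proposed in the message above.
medial = ALEF + HATAF_PATAH + METEG
left = ALEF + HATAF_PATAH + CGJ + METEG
right = ALEF + METEG + CGJ + HATAF_PATAH

# What the right meteg would look like without the CGJ.
right_without_cgj = ALEF + METEG + HATAF_PATAH


def nfc(text):
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)


def show(label, text):
    """Print a sequence as a list of code points."""
    print(label, " ".join(f"U+{ord(ch):04X}" for ch in text))


# Without CGJ, canonical reordering sorts the marks by combining class,
# so <meteg, hataf patah> collapses into the medial sequence.
show("right without CGJ, after NFC:", nfc(right_without_cgj))

# With CGJ (class 0) between the marks, reordering cannot cross it,
# so all three proposed sequences survive normalization unchanged.
for label, seq in (("medial:", medial), ("left:", left), ("right:", right)):
    show(label, nfc(seq))
```

    The point of the sketch is the right-meteg case: without the CGJ,
    <meteg, hataf patah> and <hataf patah, meteg> are canonically
    equivalent, so the distinction would not survive normalization.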

    The discussion here seems to be suggesting that the left meteg is the
    high-runner case. If this is the case, my own uninformed opinion would
    be that you'd probably have to encode a new character. Using CGJ to
    encode the distinction in either of the ways shown above seems to be
    stretching the use of CGJ a little too far, and sticking ZWJ in the
    middle of a combining character sequence seems to open the door to bad
    things. You could, I guess, encode left meteg as <hataf patah, meteg>
    and medial meteg as <hataf patah, CGJ, meteg>, but this doesn't seem
    very intuitive.

    I'll shut up now...

    --Rich Gillam
      Language Analysis Systems, Inc.


