RE: Folding algorithm and canonical equivalence

From: Jony Rosenne (rosennej@qsm.co.il)
Date: Tue Jul 20 2004 - 14:14:04 CDT


    Correction:

    05C3 (not 05C0) is a punctuation mark often used in unpointed religious
    books to indicate the end of a sentence, similar to a full stop.

    05BE is the Hebrew hyphen.

    Neither should be folded in the general case.
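
    A minimal sketch of the distinction, assuming Python and its standard
    unicodedata module. The names strip_points and
    fold_biblical_punctuation are hypothetical, chosen only for this
    illustration. By default only the combining marks (points and accents)
    are removed, and 05BE and 05C3 are left alone; the treatment of 05C0
    and 05BE suggested in the quoted message below is available as an
    opt-in for biblical text.

```python
import unicodedata

MAQAF = "\u05BE"      # HEBREW PUNCTUATION MAQAF (the Hebrew hyphen)
PASEQ = "\u05C0"      # HEBREW PUNCTUATION PASEQ
SOF_PASUQ = "\u05C3"  # HEBREW PUNCTUATION SOF PASUQ (never touched here)


def strip_points(text, fold_biblical_punctuation=False):
    """Remove Hebrew points and accents (hypothetical helper).

    With fold_biblical_punctuation=True, also delete paseq and replace
    maqaf with a space. That behaviour is an assumption suited to
    biblical text only, not a general-purpose folding.
    """
    out = []
    for ch in text:
        # Points and accents are nonspacing marks (Mn) in the Hebrew block.
        if "\u0590" <= ch <= "\u05FF" and unicodedata.category(ch) == "Mn":
            continue
        if fold_biblical_punctuation:
            if ch == PASEQ:
                continue      # annotation with no place in unpointed text
            if ch == MAQAF:
                ch = " "      # word-dividing space instead of maqaf
        out.append(ch)
    return "".join(out)
```

    Sof pasuq is kept in both modes because of its use as ordinary
    punctuation. The sketch does not collapse any double spaces that may
    remain after a paseq is deleted.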

    Jony

    > -----Original Message-----
    > From: unicode-bounce@unicode.org
    > [mailto:unicode-bounce@unicode.org] On Behalf Of Peter Kirk
    > Sent: Monday, July 19, 2004 8:53 PM
    > To: Mark E. Shoulson
    > Cc: Jony Rosenne; 'Unicode List'
    > Subject: Re: Folding algorithm and canonical equivalence
    >
    >
    > On 19/07/2004 03:20, Mark E. Shoulson wrote:
    >
    > > ...
    > >
    > > Jony's right: when it's down to brass tacks in Hebrew, it's
    > > consonants and whitespace (and punctuation, I guess).
    > >
    > Agreed. But then there are a few characters which are not combining
    > marks but which are really part of the accent system and so should
    > perhaps be stripped when points are removed: 05C0 paseq/legarmeh,
    > which should be deleted; and 05BE maqaf, which should be replaced
    > by a (word dividing) space. For 05C0 is an annotation which
    > certainly has no place in an unpointed text; and in an accented
    > text whether two words are separated by maqaf or space depends on
    > their accentuation, and space is always used in unaccented texts.
    >
    > Within the biblical text it would also be logical to delete 05C3 sof
    > pasuq, but its use elsewhere as punctuation suggests otherwise.
    >
    > --
    > Peter Kirk
    > peter@qaya.org (personal)
    > peterkirk@qaya.org (work)
    > http://www.qaya.org/


