Re: Yerushala(y)im - or Biblical Hebrew

From: Ted Hopp
Date: Mon Jul 07 2003 - 11:51:08 EDT

    On 07/07/2003 8:52 AM, Peter Kirk wrote:
    > On 06/07/2003 17:22, John Hudson wrote:
    > > ... Given the small number of attested sequences that would be
    > > adversely affected by normalisation re-ordering, I'm beginning to
    > > favour the idea of encoding these sequences as individual characters.
    > > We'd probably only need three or four, plus a right meteg, to solve
    > > the problem, and rendering would work fine with existing font and
    > > layout engine technologies.
    > This sounds like a sensible alternative.

    This would make data entry difficult for users. Nobody thinks of these
    character sequences as single characters. Editing would also be an
    "interesting" experience. Could one search for lamed-patah and find it as
    part of lamed-<patah+hiriq>? Or would the proposal be to use these new codes
    only as part of bookend processing around normalization (i.e., automatically
    recognize the sequences and substitute, normalize, and then automatically
    substitute back)?
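
    To make the "bookend" idea concrete, here is a rough sketch of what I have
    in mind, using Python's standard unicodedata module. The placeholder code
    point (U+E000, from the Private Use Area) and the function name
    bookend_normalize are purely my own stand-ins for illustration; nothing
    like them has actually been proposed or encoded.

```python
import unicodedata

LAMED, PATAH, HIRIQ = "\u05DC", "\u05B7", "\u05B4"

# Hypothetical stand-in for a single "patah+hiriq" character. U+E000 is a
# Private Use Area code point (combining class 0), not a real assignment.
PLACEHOLDER = "\uE000"
SEQUENCE = PATAH + HIRIQ


def bookend_normalize(text, form="NFC"):
    """Substitute the sequence, normalize, then substitute back."""
    protected = text.replace(SEQUENCE, PLACEHOLDER)
    normalized = unicodedata.normalize(form, protected)
    return normalized.replace(PLACEHOLDER, SEQUENCE)


word = LAMED + PATAH + HIRIQ

# Plain normalization swaps the two vowels (hiriq sorts before patah).
plain = unicodedata.normalize("NFC", word)

# Bookend processing keeps the order as typed.
kept = bookend_normalize(word)

# The search problem: once the pair is stored as one code, a search for
# lamed-patah no longer matches the stored text.
stored_as_single_code = word.replace(SEQUENCE, PLACEHOLDER)
found_in_sequence_form = (LAMED + PATAH) in kept
found_in_single_code_form = (LAMED + PATAH) in stored_as_single_code
```

    Note that the text that comes out of bookend_normalize is, by
    construction, no longer in normalized form, so any later process that
    normalizes again would undo the protection.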

    I think we need to keep Peter Constable's point in mind that current usage
    should not define the limits of Unicode functionality. Since the principle
    is that all sequences of character codes are permitted (2.10), it seems
    wrong to supply a fix for only "the small number of attested sequences". In
    view of this principle, the current combining class values are at odds with
    definition D46 (combining class; section 3.11) as well as with the
    discussion in 2.10 on multiple combining characters. That is what should be
    fixed.

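    For anyone who wants to see the combining class values in question, the
    following small sketch reads them from the Unicode Character Database via
    Python's unicodedata module. The variable names are mine; the character
    names are the standard ones. It also shows that a meteg typed before a
    patah comes out after it once the text is normalized, which is the
    ordering distinction that gets lost.

```python
import unicodedata

names = ["HEBREW POINT HIRIQ", "HEBREW POINT PATAH", "HEBREW POINT METEG"]

# Canonical combining class of each mark. Marks with different nonzero
# classes are sorted into ascending class order by normalization.
classes = {name: unicodedata.combining(unicodedata.lookup(name)) for name in names}

PATAH = unicodedata.lookup("HEBREW POINT PATAH")
METEG = unicodedata.lookup("HEBREW POINT METEG")
LAMED = unicodedata.lookup("HEBREW LETTER LAMED")

# Two different typed orders of the same marks on one base letter.
meteg_first = unicodedata.normalize("NFD", LAMED + METEG + PATAH)
patah_first = unicodedata.normalize("NFD", LAMED + PATAH + METEG)

# After normalization the two orders are indistinguishable.
orders_collapse = meteg_first == patah_first

for name, ccc in classes.items():
    print(f"{name}: combining class {ccc}")
```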

    Ted Hopp, Ph.D.
    ZigZag, Inc.

    This archive was generated by hypermail 2.1.5 : Mon Jul 07 2003 - 12:43:26 EDT