Re: Yerushala(y)im - or Biblical Hebrew

From: Peter Kirk
Date: Tue Jul 08 2003 - 05:23:59 EDT


    On 07/07/2003 19:23, John Hudson wrote:

    > At 08:51 07/07/2003, Ted Hopp wrote:
    >> Editing would also be an "interesting" experience. Could one search
    >> for lamed-patah and find it as part of lamed-<patah+hiriq>? Or would
    >> the proposal be to use these new codes only as part of bookend
    >> processing around normalization (i.e., automatically recognize the
    >> sequences and substitute, normalize, and then automatically
    >> substitute back)?
    > I suppose the latter is feasible. I am very keen that *any* solution
    > should be invisible to the user.
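To make the "bookend" idea concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the private-use code point U+E000 is a hypothetical stand-in for whatever new code might be proposed, the helper name normalize_keeping_order is invented for this example, and the text is assumed not to contain U+E000 already.

```python
import unicodedata

LAMED, PATAH, HIRIQ = "\u05DC", "\u05B7", "\u05B4"
# Hypothetical stand-in for a "patah-hiriq" code; U+E000 is a private-use
# code point chosen only for this sketch, not a real or proposed character.
PLACEHOLDER = "\uE000"


def normalize_keeping_order(text, form="NFC"):
    """Substitute, normalize, then substitute back (the bookend idea)."""
    # 1. Hide the patah+hiriq sequence behind the placeholder.
    protected = text.replace(PATAH + HIRIQ, PLACEHOLDER)
    # 2. Normalize; the placeholder has combining class 0 and is left alone.
    normalized = unicodedata.normalize(form, protected)
    # 3. Restore the original sequence in its written order.
    return normalized.replace(PLACEHOLDER, PATAH + HIRIQ)


word = LAMED + PATAH + HIRIQ
result = normalize_keeping_order(word)

# While the placeholder is in place, a plain search for lamed-patah fails;
# after substituting back, it succeeds again.
found_while_protected = (LAMED + PATAH) in word.replace(PATAH + HIRIQ, PLACEHOLDER)
found_after_restore = (LAMED + PATAH) in result
```

Note that the restored string is, strictly speaking, no longer in normalized form, so this only helps if nothing downstream normalizes the text again.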

    Would it work to define a new character, for example, for patah-hiriq
    which has a canonical decomposition into patah plus hiriq, or even into
    hiriq plus patah? Would normalisation compose a patah-hiriq sequence
    into this character and so get round the reordering problem? Remember
    that the reverse sequence is actually not attested, as far as I can tell,
    for any of the sequences in question.
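For anyone who wants to see the reordering problem directly, the following is a small sketch using Python's standard unicodedata module. It shows only current behaviour for lamed followed by patah and hiriq; it says nothing about how a newly encoded character would behave, and the variable names are just for illustration.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED
PATAH = "\u05B7"  # HEBREW POINT PATAH, canonical combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, canonical combining class 14

# The order wanted in the text: lamed, patah, hiriq.
written = LAMED + PATAH + HIRIQ
normalized = unicodedata.normalize("NFC", written)

# Combining classes of the three characters as written.
print([unicodedata.combining(c) for c in written])   # [0, 17, 14]

# Canonical ordering sorts the marks by class, so hiriq moves before patah.
print([f"U+{ord(c):04X}" for c in normalized])        # ['U+05DC', 'U+05B4', 'U+05B7']

reordered = normalized == LAMED + HIRIQ + PATAH
```

Because the two points have different non-zero combining classes, any normalization form produces the same swapped order.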

    >> I think we need to keep Peter Constable's point in mind that current
    >> usage should not define the limits of Unicode functionality. Since the
    >> principle is that all sequences of character codes are permitted
    >> (2.10), it seems wrong to supply a fix for only "the small number of
    >> attested sequences".
    But I agree here. The kind of solution I have just proposed is in danger
    of escalating, just as the number of precomposed Latin characters
    escalated until a decision was made not to add any more.

    Peter Kirk
