Re: Yerushala(y)im - or Biblical Hebrew

From: Mark Davis
Date: Mon Jul 28 2003 - 18:14:48 EDT


    Changing the canonical order is not going to happen. If you want to
    read about the problems that that would cause, there has been plenty
    written about it on this list if you consult the archives.

    ► “Eppur si muove” ◄

    ----- Original Message -----
    From: <>
    To: <>
    Sent: Monday, July 28, 2003 14:25
    Subject: Re: Yerushala(y)im - or Biblical Hebrew

    > > Why can't we just fix the database? :)
    > Because changing the canonical ordering classes (in ways not
    > allowed by the stability policies) breaks the normalization
    > *algorithm* and the expected test results it is tested against.
    >
    > If the "expected test results" are bad data, it shouldn't matter then
    > if it is consistent. Are you saying that somewhere there are lots of
    > people who have worked very hard to implement Hebrew as it is
    > currently described in Unicode 3 and they would have to "start over"
    > if we changed the canonical order? And the biggest fear is that the
    > data won't be consistent with the data in the new order? My point is
    > that there *is* no data today, because anyone who has attempted to
    > produce biblical Hebrew data in the current canonical order would
    > have stopped and said "Wait a minute! This doesn't work".
    >
    > That's what I'm saying. And I have no particular problem with the
    > suggestion, but it doesn't go far enough. I don't think we can use it
    > to fix meteg, a mark which occurs in three different positions around
    > a low vowel, yet is canonically before the shin/sin dots! Will we put
    > one CGJ on the right to indicate a right meteg and one on the left to
    > indicate a left meteg? There are many other examples of problems with
    > the canonical order. The apparent simplest solution to all the
    > problems is to correct the canonical order.
    >
    > > Unless you are talking about conversion algorithms for batch
    > > conversion of existing Biblical Hebrew repositories into Unicode --
    > > but those are specialized code to begin with, and it is much less
    > > impact to ask people to update the tables in those to insert a CGJ
    > > into the point sequences than it is to ask all implementers to deal
    > > with the consequences of broken normalization.
    >
    > Yes, I am talking about the person writing a batch conversion from
    > data into Unicode. That would be me. If you were only suggesting we
    > insert one CGJ, I wouldn't complain. But we are looking at
    > re-writing the font, the keyboards, and the conversion so that we can
    > work around the numerous problems with canonical order. I am
    > preferring that you "normalizers" re-write your code. :)
    >
    > Joan W.
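
For anyone reading the archive who wants to see the reordering concretely, here is a minimal sketch in Python using the standard `unicodedata` module. It is an illustration only, not code from anyone in the thread. It assumes a lamed carrying patah followed by hiriq as a stand-in for the Yerushala(y)im case named in the subject line (the messages themselves give no code points), and the helper names `combining_classes` and `order_survives_normalization` are made up for this example. It shows that normalization swaps the two vowels because hiriq has the lower canonical combining class, and that a CGJ (U+034F, class 0) placed between them keeps the typed order.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"  # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, combining class 14
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, combining class 0


def combining_classes(text):
    """Return the canonical combining class of each code point."""
    return [unicodedata.combining(ch) for ch in text]


def order_survives_normalization(text, form="NFC"):
    """True if normalizing leaves the code point sequence unchanged."""
    return unicodedata.normalize(form, text) == text


# Assumed example sequence: the order a scribe or typist would want.
plain = LAMED + PATAH + HIRIQ
# Same sequence with a CGJ inserted between the two vowel points.
with_cgj = LAMED + PATAH + CGJ + HIRIQ

print(combining_classes(plain))                # [0, 17, 14]
print(order_survives_normalization(plain))     # False: hiriq moves before patah
print(order_survives_normalization(with_cgj))  # True: CGJ blocks the reordering
```

The same check can be run over the output of a batch conversion to find sequences that normalization would reorder. Whether a CGJ is an acceptable fix for cases like meteg is exactly what the thread is arguing about; the sketch only demonstrates the mechanism.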
