Re: Yerushala(y)im - or Biblical Hebrew

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Jul 25 2003 - 15:34:28 EDT


    Ted Hopp asked:

    > Tell me if I'm wrong please, but isn't moving characters (however
    > it's disguised) as much of a violation of the stability policy as is
    > changing combining classes of the existing vowels?

    You're not wrong. It is a violation of the stability policy.

    > The Hebrew vowels interact typographically and the combining classes should
    > have been assigned accordingly originally. That's what should be fixed now.

    But that is water under the bridge at this point. I am looking for
    a *feasible* technical solution to the current *technical* problem.

    > I recognize the powerful political issues involved, and that these are
    > barriers to this happening. But trying to find technical solutions to
    > political problems is extremely short-sighted. I would urge that all
    > technical efforts be directed away from solving the politicians' problems
    > and focus on how to minimize whatever damage may be caused by changing the
    > combining classes.

    What we have currently are:

      a. a minor technical problem (that certain sequences of vowel
         points in Biblical Hebrew cannot be reliably distinguished
         in normalized Unicode plain text)
         
    and

      b. a minor political problem (that certain communities of Biblical
         scholars are badmouthing Unicode because it "can't fix its
         obvious mistakes")
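
To make technical problem (a) concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The choice of lamed with patah and hiriq (the sequence behind the "Yerushala(y)im" example in the subject line) is an illustrative assumption; any pair of Hebrew points with different combining classes should behave the same way.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ (canonical combining class 14)
PATAH = "\u05B7"  # HEBREW POINT PATAH (canonical combining class 17)

# Two different orderings of the same two points on one base letter.
patah_first = LAMED + PATAH + HIRIQ
hiriq_first = LAMED + HIRIQ + PATAH

# Canonical ordering sorts adjacent combining marks by combining class,
# so the author's original ordering does not survive normalization.
nfc_patah_first = unicodedata.normalize("NFC", patah_first)
nfc_hiriq_first = unicodedata.normalize("NFC", hiriq_first)

print(unicodedata.combining(HIRIQ), unicodedata.combining(PATAH))
print(patah_first == hiriq_first)          # distinct before normalization
print(nfc_patah_first == nfc_hiriq_first)  # indistinguishable afterwards
```

Because the two points have different non-zero combining classes, both inputs normalize to the same string, which is why the distinction cannot be carried reliably in normalized plain text.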
         
    Changing the combining classes of Hebrew points will create:

      a. a major technical problem (destabilization of normalization)
      
    and

      b. a major political problem (in both IETF and W3C, at least,
         as well as between members in the UTC, with the non-zero
         risk that the rift will result in the definition of competing
         specifications of normalization, which will compound the
         technical problem)
         
    >
    > From my company's perspective, all other proposals I've seen would be more
    > damaging to us than doing the right thing. It would be beneficial to hear
    > from others on this list about what the specific technical (not political)
    > impacts would be (both positive and negative) on their work and their
    > products that would come from fixing the combining classes of the existing
    > vowels.

    Speaking for Sybase products, "fixing" the combining classes of the
    existing vowels would have *no* positive impacts. It would have
    a large number of negative impacts, the ultimate ramifications
    of which I cannot even follow to their eventual conclusions. It
    would impact the implementation of normalization code, as well
    as its testing. It would lead to nasty meetings where I would
    try to explain to server developers why normalization on the
    servers wasn't quite reliable, since it has this little Hebrew
    "hole" over here. It might require figuring out how to specify
    versions of normalization -- and I have no idea how the labels
    for that would be reliably attached to data. It puts normalized
    data into a kind of a Catch-22 situation where you could never
    rely on its stability, since if it contained any of the offending
    characters, the determination of whether it would change or not
    if normalized again would depend on which *version* of the
    normalization code it hit. The reaction, in the context of
    some protocols or even database implementations, might be to
    deny access to the offending characters -- essentially to rule
    them off-limits because of their impact on normalization. And
    how would *that* benefit Biblical Hebrew scholars?
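
The version dependence described above can be sketched roughly as follows. This is a toy model, not any real normalizer: `canonical_reorder` implements only the canonical ordering step, and `HYPOTHETICAL_CCC` is an invented "fixed" class assignment chosen purely for illustration (no such values were ever proposed here).

```python
import unicodedata

def canonical_reorder(text, ccc):
    """Stable-sort each run of combining marks by the given class function."""
    out, run = [], []
    for ch in text:
        if ccc(ch) == 0:
            # A starter ends the current run of combining marks.
            out.extend(sorted(run, key=ccc))  # sorted() is stable
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)

# The combining classes as actually published.
current_ccc = unicodedata.combining

# HYPOTHETICAL: pretend hiriq and patah were reassigned to one shared class,
# so that their relative order is preserved. The value 220 is invented.
HYPOTHETICAL_CCC = {"\u05B4": 220, "\u05B7": 220}

def hypothetical_ccc(ch):
    return HYPOTHETICAL_CCC.get(ch, unicodedata.combining(ch))

LAMED, HIRIQ, PATAH = "\u05DC", "\u05B4", "\u05B7"
source = LAMED + PATAH + HIRIQ

by_current = canonical_reorder(source, current_ccc)
by_hypothetical = canonical_reorder(source, hypothetical_ccc)

# The same input yields different "normalized" data per version...
print(by_current == by_hypothetical)
# ...and data normalized by one version is changed again by the other.
print(canonical_reorder(by_hypothetical, current_ccc) == by_hypothetical)
```

In this toy setup the two class tables disagree on the same input, and text already "normalized" under one table is rewritten when it meets the other, which is the stability hole a server would have to explain or work around.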

    The whole situation just stinks of the never-ending cascade of
    problems that result, for example, from the small set of
    persistent interoperability mismatches between various
    interpretations of Shift-JIS encoding. That's the kind of
    problem that can persist for a decade or more and which just
    gets passed as the hot potato from one generation of
    developers to the next.

    I expect you could hear testimonials from other database
    developers on the list about the evils of destabilizing the
    definition of Unicode normalization.

    --Ken


