Re: Response to Everson Phoenician and why June 7?

From: Peter Kirk (peterkirk@qaya.org)
Date: Tue May 25 2004 - 19:11:11 CDT


    On 25/05/2004 12:14, Kenneth Whistler wrote:

    >Peter,
    >
    >
    >
    >>>>There is no consensus that this Phoenician proposal is necessary. I
    >>>>and others have also put forward several mediating positions e.g.
    >>>>separate encoding with compatibility decompositions
    >>>>
    >>>>
    >>>Which was rejected by Ken for good technical reasons.
    >>>
    >>>
    >>I don't remember any technical reasons, it was more a matter of "we
    >>haven't done it this way before".
    >>
    >>
    >
    >The *reason* why we haven't done it this way before is because
    >it would cause technical difficulties.
    >
    >Compatibility decompositions directly impact normalization.
    >
    >

    Understood. I'm not convinced that that is a problem, but I don't insist
    on this.
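    [Editor's note: Ken's point that compatibility decompositions feed directly into normalization can be illustrated with Python's unicodedata; a minimal sketch, using an existing compatibility decomposition (U+210C BLACK-LETTER CAPITAL H) as a stand-in for the proposed Phoenician-to-Hebrew ones:]

```python
import unicodedata

# Compatibility decompositions are applied by NFKD/NFKC but not by
# NFD/NFC. U+210C BLACK-LETTER CAPITAL H carries a <font>
# compatibility decomposition to plain 'H'.
assert unicodedata.normalize('NFC', '\u210C') == '\u210C'  # unchanged
assert unicodedata.normalize('NFKC', '\u210C') == 'H'      # folded away

# So if Phoenician letters were given compatibility decompositions to
# Hebrew, every NFKC-normalizing process (identifier matching,
# searching, security checks) would silently rewrite Phoenician text
# as Hebrew -- which is why such decompositions affect normalization,
# not just display.
```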

    >Cross-script equivalencing is done by transliteration algorithms,
    >not by normalization algorithms.
    >
    >

    But you are begging the question by calling this "cross-script".

    >If you try to blur the boundary between those two by introducing
    >compatibility decompositions to equate across separately encoded
    >scripts, the net impact would be to screw up *both* normalization
    >and transliteration by conflating the two. You
    >would end up with confusion among both the implementers of
    >such algorithms and the consumers of them.
    >
    >
    >
    OK.
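    [Editor's note: the cross-script equivalencing Ken assigns to transliteration rather than normalization can be sketched as a simple mapping table. Phoenician was in fact later encoded at U+10900..U+10915 (Unicode 5.0); its 22 letters line up one-to-one, in shared alphabetic order, with the 22 non-final Hebrew consonants. The function name below is illustrative, not a real API:]

```python
# The 22 non-final Hebrew consonants, skipping the five final forms
# (final kaf, mem, nun, pe, tsadi) in U+05D0..U+05EA.
HEBREW_NONFINAL = [
    chr(cp) for cp in range(0x05D0, 0x05EB)
    if chr(cp) not in '\u05DA\u05DD\u05DF\u05E3\u05E5'
]
# The 22 Phoenician letters, U+10900 ALF .. U+10915 TAU.
PHOENICIAN = [chr(0x10900 + i) for i in range(22)]

# Both lists are in the same alphabetic order, so a direct zip gives
# the one-to-one mapping table under discussion.
PHN_TO_HEB = dict(zip(PHOENICIAN, HEBREW_NONFINAL))

def transliterate_phoenician_to_hebrew(text: str) -> str:
    """Replace each Phoenician letter with its Hebrew counterpart."""
    return ''.join(PHN_TO_HEB.get(ch, ch) for ch in text)

# Phoenician ALF BET -> Hebrew alef bet
assert transliterate_phoenician_to_hebrew(
    '\U00010900\U00010901') == '\u05D0\u05D1'
```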

    >>But perhaps that is only because the
    >>need to do this has not previously been identified.
    >>
    >>
    >
    >No, that is not the case.
    >
    >
    >
    >>However, I can make
    >>a good case for the new Coptic letters being made compatibility
    >>equivalent to Greek - which can still be done, presumably -
    >>
    >>
    >
    >But will not be done. If you attempted to make your case, you
    >would soon discover that even *if* such cross-script equivalencing
    >via compatibility decompositions were a good idea (which it isn't),
    >you would end up with inconsistencies, because some of the Coptic
    >letters would have decompositions and some could not (because they
    >are already in the standard without decompositions). You'd end
    >up with a normalization nightmare (where some normalization
    >forms would fold Coptic and Greek, and other normalization
    >forms would not), while not having a transliteration solution.
    >
    >

    Well, they would fold down to the current Unicode 4.0 situation, as the
    new Coptic letters would fold to the old ones, and the Coptic-only
    letters in the old block would not be changed. This would have the great
    advantage that documents encoded with the still-current Coptic encoding
    would remain compatibility-equivalent to new documents. Of course it
    would confuse people who don't know the history, but there is plenty of
    that in Unicode already.
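    [Editor's note: as it actually turned out, Coptic was disunified into a new block at U+2C80.. in Unicode 4.1 with no decompositions back to Greek, and the Coptic-only letters already in the Greek block (U+03E2..U+03EF) never had any either, which is Ken's consistency point. A quick check:]

```python
import unicodedata

# Neither the new Coptic block (Unicode 4.1) nor the Coptic-only
# letters long resident in the Greek block carry any decomposition,
# so a partial Greek/Coptic folding was never introduced.
for ch in ('\u2C80',   # COPTIC CAPITAL LETTER ALFA (new block)
           '\u03E2'):  # COPTIC CAPITAL LETTER SHEI (old Greek block)
    assert unicodedata.decomposition(ch) == ''
```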

    >The UTC would, I predict, reject such a proposal out of hand.
    >
    >
    >
    >>as well as
    >>for similar equivalences for scripts like Gothic and Old Italic, and
    >>perhaps Indic scripts - which presumably cannot now be added for
    >>stability reasons.
    >>
    >>
    >
    >Correct.
    >
    >
    >
    >>>>and with interleaved collation,
    >>>>
    >>>>
    >>>Which was rejected for the default template (and would go against the
    >>>practices already in place in the default template) but is available
    >>>to you in your tailorings.
    >>>
    >>>
    >>Again, a matter of "we haven't done it this way before".
    >>
    >>
    >
    >I don't like the notion of interleaving in the default weighting
    >table, and have spoken against it, but as John Cowan has pointed
    >out, it is at least feasible. It doesn't have the ridiculousness
    >factor of the compatibility decomposition approach.
    >
    >
    >
    Well, perhaps this is a way of finding an acceptable mediating position
    to put an end to the endless arguments in this thread. It may be a bit
    messy, like most compromises, but as it is feasible it is worthy of
    serious consideration. It should overcome the most serious objections of
    Semitic scholars and others to separate encoding of Phoenician - although
    I can't speak for everyone who has strong views on this subject.
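    [Editor's note: the interleaved-collation compromise can be sketched as a tailoring in which each Phoenician letter shares a primary weight with its Hebrew counterpart and differs only at a secondary level. The weights below are purely illustrative, not real UCA/DUCET values, and the code-point ranges assume the eventual U+10900 Phoenician block:]

```python
# 22 non-final Hebrew consonants and 22 Phoenician letters, both in
# alphabetic order.
HEBREW = [chr(cp) for cp in range(0x05D0, 0x05EB)
          if chr(cp) not in '\u05DA\u05DD\u05DF\u05E3\u05E5']
PHOENICIAN = [chr(0x10900 + i) for i in range(22)]

# Interleaved tailoring: same primary weight per letter pair, with a
# secondary weight distinguishing the scripts on ties.
WEIGHTS = {}
for primary, (heb, phn) in enumerate(zip(HEBREW, PHOENICIAN)):
    WEIGHTS[heb] = (primary, 0)  # Hebrew first on primary ties
    WEIGHTS[phn] = (primary, 1)  # Phoenician immediately after

def sort_key(word):
    # Unmapped characters sort after the interleaved letters.
    return [WEIGHTS.get(ch, (len(HEBREW) + ord(ch), 0)) for ch in word]

# Phoenician bet sorts with Hebrew bet, after alef of either script.
words = ['\u05D1', '\U00010900', '\u05D0', '\U00010901']
assert sorted(words, key=sort_key) == [
    '\u05D0', '\U00010900', '\u05D1', '\U00010901']
```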

    >>>>also encoding as variation sequences,
    >>>>
    >>>>
    >>>Which was rejected by Ken and others for good technical reasons, not
    >>>the least of which was the *preposterousness* of
    >>>interleaving Hebrew text in order to get Phoenician glyphs.
    >>>
    >>>
    >>I don't like this one myself either.
    >>
    >>
    >
    >So can we please just drop it?
    >
    >
    >
    With pleasure.

    > ...
    >
    >>So I am
    >>looking for a technical solution which comes somewhere between these two
    >>extremes, which officially recognises the one-to-one equivalence between
    >>Phoenician and (a subset of) Hebrew while making a plain text
    >>distinction possible for those who wish to make it.
    >>
    >>
    >
    >The technical solution for that is:
    >
    >A. Encode Phoenician as a separate script. (That accomplishes the
    > second task, of making a plain text distinction possible.)
    >
    >B. Assert in the *documentation* that there is a well-known
    > one-to-one equivalence relationship between the letters of
    > this (and other 22CWSA) and Hebrew letters -- including the
    > publication of the mapping tables as proof of concept.
    >
    >

    No, this doesn't go far enough, even for me, so almost certainly not for
    others. This is accepting the splitters' case and throwing in a footnote
    in the hope of satisfying the joiners. I would think that the least that
    would be acceptable is default interleaved collation.

    >...
    >
    >>>The technical solutions you have proposed have been inadequate.
    >>>
    >>>
    >>Can you suggest one which is more adequate? Or in fact are you
    >>determined to reject any solution, using doubtful technical arguments
    >>against the details because you have failed to produce convincing
    >>arguments against the principle?
    >>
    >>
    >
    >Michael is correct. But don't expect *him* to provide you with
    >all the nitty-gritty dirt from *inside* the library, OS,
    >database, and application vendors' code, because he isn't that
    >level of implementer.
    >
    >

    Nor am I. He claimed to know enough to judge that the technical
    solutions were inadequate. Well, your position seems to be that, of the
    three, two are technically very difficult and one is feasible. Let's
    explore further the one which is feasible.

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    


    This archive was generated by hypermail 2.1.5 : Tue May 25 2004 - 19:12:37 CDT