Re: PH technical issues (was RE: Why Fraktur is irrelevant

From: Peter Kirk
Date: Sat May 29 2004 - 06:03:23 CDT

    On 28/05/2004 17:41, Kenneth Whistler wrote:

    > ...
    >This also struck me as a major misunderstanding in Peter Kirk's
    >note, which may underlie some of the problem this thread has
    >been having in coming to *any* conclusions whatsoever.
    >Take a look at page 343 of the Unicode Standard, which shows a
    >line from the Codex Argenteus in Gothic script. That line is
    >then *transliterated* into the Latin script, and a translation
    >is also given. Taking just the last word, we have the

    (snipped to save bandwidth)

    Thank you, Ken. I understand better what is going on now. I accept your
    argument that Gothic is roughly parallel to Phoenician in this regard,
    although it does not have such a clear one-to-one correspondence with
    any other script (and the closest other script is Greek, not Latin). Of
    course this does not entirely imply that Unicode should treat Phoenician
    in the same way as Gothic, because the separate treatment of Gothic may
    have been a mistake - although that is not my argument now.

    But the Sally and Latisha scenario still begs the question: there is a
    problem of preservation of distinct character semantics only for those
    who presuppose that there is a semantic distinction in the first place,
    and not only a glyphic one. I am now more or less convinced that
    there IS a semantic distinction in this sense, and so I am supporting
    separate encoding. What I am NOT convinced of is that that semantic
    distinction is significant enough to require a completely separate
    script. It seems to me much more like the kind of distinction at the
    glyph level only for which variation sequences were introduced.

    Now I see also good reasons for not encoding plain text Phoenician
    entirely with variation sequences. (Although length of the text is not a
    good reason: the length is unchanged in UTF-16 and increased by only 25%
    in UTF-8.) But we have discussed alternatives such as interleaved
    collation in the default collation table. Does anyone on this list have
    any strong objections to this?
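
    The byte counts in the parenthetical above can be checked with a short
    sketch. It rests on two assumptions that are illustrative only: that a
    variation-sequence encoding would pair each Hebrew letter (U+05D0 here)
    with a BMP variation selector (U+FE00 is used as a stand-in; no such
    sequence is defined), and that a separately encoded Phoenician letter
    would sit in a supplementary plane (U+10900, the code point later
    assigned to PHOENICIAN LETTER ALF). The helper names are hypothetical.

```python
# Illustrative assumptions: U+FE00 as the selector, U+10900 as the separate letter.
HEBREW_ALEF = "\u05D0"          # BMP: 2 bytes in UTF-16, 2 bytes in UTF-8
VARIATION_SELECTOR = "\uFE00"   # BMP: 2 bytes in UTF-16, 3 bytes in UTF-8
PHOENICIAN_ALF = "\U00010900"   # supplementary plane: 4 bytes in both encodings


def byte_length(text: str, encoding: str) -> int:
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


def compare(encoding: str) -> tuple[int, int]:
    """Return (bytes per letter as a variation sequence, bytes per separate letter)."""
    return (
        byte_length(HEBREW_ALEF + VARIATION_SELECTOR, encoding),
        byte_length(PHOENICIAN_ALF, encoding),
    )


# "utf-16-le" is used so that no byte order mark is counted.
for encoding in ("utf-16-le", "utf-8"):
    as_sequence, as_separate = compare(encoding)
    print(encoding, as_sequence, as_separate, as_sequence / as_separate)
```

    Under these assumptions each letter takes 4 bytes either way in UTF-16,
    and 5 bytes instead of 4 in UTF-8, which is the 25% increase mentioned.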


    >And no matter how many times Peter Kirk begs the question of
    >what is a script distinction, what it comes down to in
    >the Unicode Standard is that a script distinction is a
    >distinct encoding of a script, neither more nor less.
    >It does not correlate directly to a graphologist's or
    >palaeographer's definition (if they have one) of what
    >a script is, nor can it be defined, a priori, axiomatically. ...

    Nor for that matter does it correlate directly to the pedagogue's
    definition, or to the ancient Israelite's definition, although both of
    these have also been appealed to in this discussion. But this implies
    that it is meaningless so far to say whether Phoenician is a separate
    script or not in the Unicode sense, because this is purely a matter of
    definition by the UTC and so far the UTC has not made any such
    definition. Of course some could argue that it is not a separate script
    because it is not in the standard, but others could argue that it is
    because it is in the roadmap. But such discussions are pointless; the
    real question is, *should* it be defined as a separate script, in the
    Unicode sense, or not?

    On 28/05/2004 14:58, Peter Constable wrote:

    > ...
    >>Well, if anyone has another scenario to propose, let's see it.
    >Scenario (undesirable):
    >The editor of a UCLA journal on ancient Indo-European linguistics
    >receives submissions from numerous sources for publication in the
    >journal. ...
    >Alternate scenario (desirable):
    >The editor receives submissions as described above. Because Phoenician
    >script and Hebrew script are encoded distinctly, there is never any
    >concern as to how text provided to reviewers will appear. She saves many
    >hours of work both in preparing submissions for reviewers and in final
    >typesetting. Embarrassing errors and the need to publish corrigenda are
    >significantly reduced.
    >Now tell me that's an unrealistic or trivial scenario.

    No, that is a realistic scenario, perhaps because it came from a real
    such editor. Well, she might have saved us many hours of work, and
    embarrassing errors on every side, if she had presented this scenario a
    month or so ago. I agree that in this scenario a plain text distinction
    between Phoenician and Hebrew is desirable. I would be concerned about
    how many additional plain text distinctions could be justified by this
    means, e.g. between different types of Old Italic and Runes as D.
    Starner mentions, even between Fraktur and Antiqua as someone may have
    very deliberately submitted an (e.g. old German) text in Fraktur and
    consider it an error for the paper to be printed with Antiqua glyphs.
    For that matter, in many submissions markup such as italic (for
    emphasis, quoted words etc) is significant and must be preserved, and
    this implies that the editor cannot work with plain text only.
    Nevertheless, I agree that the editor's task will be simplified by a
    plain text distinction between Phoenician and Hebrew, and that this
    scenario is not trivial.

    >I suspect few Semitic paleographers are using MS database products. ...

    And is MS happy about this situation? ;-)

    On 28/05/2004 15:26, Peter Constable wrote:

    >>From: Peter Kirk
    >>But I was thinking in
    >>terms of tailored collation weights for the Unicode collation
    >And moreover (adding to comments in my previous message), it seems
    >*very* likely to me that, on those occasions when the Semitic
    >paleographer is going to need to fold characters, they're going to deal
    >with it not by UCA tailoring but by converting the Phoenician characters
    >(as they would more often with Latin characters) to Hebrew characters.

    Well, this does not deal with the scenario which I had in mind, and
    clearly presented some time ago, in which users are searching the
    Internet, or some private but extensive collection of texts, for a
    particular word or phrase, in Hebrew or for that matter Moabite etc or
    even Phoenician. Currently such a search would need to match Hebrew
    characters and also a variety of Latin transliterations. (Hopefully over
    time the use of Latin transliterations will fade, or at least become
    more standardised as transliterators can use real Unicode characters
    with diacritics and not ad hoc ASCII-based solutions.) But if Phoenician
    is separately encoded, and at least some palaeo-Hebrew, Moabite etc
    texts are represented with the Phoenician characters, searchers will
    need to search for an additional encoding. For that matter, searchers
    for texts written with Phoenician glyphs will also be inconvenienced
    because some such texts will be represented by Hebrew characters. In
    such a case the user cannot convert all texts to Hebrew characters in
    advance; the folding must be applied by the search engine.
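
    A minimal sketch of what such search-side folding might look like,
    assuming Phoenician letters at U+10900..U+10915 (the code points later
    assigned to the block) and a simple one-to-one mapping onto the 22
    non-final Hebrew letters. The names FOLD_TABLE, fold and matches are
    hypothetical; a real search engine would more likely do this through
    its own normalisation or collation layer. Hebrew final forms are folded
    too, since Phoenician has no such distinction.

```python
# The 22 non-final Hebrew letters, alef to tav, as code points.
HEBREW_CODE_POINTS = [
    0x05D0, 0x05D1, 0x05D2, 0x05D3, 0x05D4, 0x05D5, 0x05D6, 0x05D7,
    0x05D8, 0x05D9, 0x05DB, 0x05DC, 0x05DE, 0x05E0, 0x05E1, 0x05E2,
    0x05E4, 0x05E6, 0x05E7, 0x05E8, 0x05E9, 0x05EA,
]

# Hebrew final forms mapped to their ordinary forms (kaf, mem, nun, pe, tsadi).
FINAL_FORMS = {
    0x05DA: 0x05DB, 0x05DD: 0x05DE, 0x05DF: 0x05E0,
    0x05E3: 0x05E4, 0x05E5: 0x05E6,
}

# Assumption: Phoenician letters occupy U+10900..U+10915 in the same order.
FOLD_TABLE = {0x10900 + i: cp for i, cp in enumerate(HEBREW_CODE_POINTS)}
FOLD_TABLE.update(FINAL_FORMS)


def fold(text: str) -> str:
    """Map Phoenician letters and Hebrew final forms to plain Hebrew letters."""
    return text.translate(FOLD_TABLE)


def matches(query: str, document: str) -> bool:
    """True if the folded query occurs in the folded document."""
    return fold(query) in fold(document)


# MLK ("king") in each encoding, and MLK M'B ("king of Moab") in Hebrew letters.
HEBREW_MLK = "\u05DE\u05DC\u05DA"
PHOENICIAN_MLK = "\U0001090C\U0001090B\U0001090A"
HEBREW_MLK_MAB = "\u05DE\u05DC\u05DA \u05DE\u05D0\u05D1"

print(matches(PHOENICIAN_MLK, HEBREW_MLK_MAB))
print(fold(PHOENICIAN_MLK) == fold(HEBREW_MLK))
```

    With the folding applied to both query and text, a search typed in
    either set of characters finds texts stored in the other, without the
    stored texts being converted in advance.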

    Is this a realistic scenario? Is it one which really requires folding
    together of Hebrew and Phoenician? What does anyone else think?

    On 28/05/2004 19:06, saqqara wrote:

    >----- Original Message -----
    >From: "Peter Constable"
    >Sent: Friday, May 28, 2004 10:58 PM
    >>>Is it really in the scope of Unicode to encode such trivialities? I
    >>>have a key ring with my name "written" in an Egyptian hieroglyphic
    >>>pseudo-alphabet. Will such abuse of Egyptian hieroglyphs have to be
    >>>taken into account in the possible Unicode proposal for this script?
    >>Why is that an abuse of hieroglyphs any more than Hebrew text
    >>transliterated or transcribed in Latin characters, or Arabic text
    >>transcribed in Hangul characters? Unicode is uninterested in what the
    >>content of the text is; it encodes characters, not text. It is up to
    >>users and implementers to decide what texts those characters can
    >>So, absolutely, it is in the scope of Unicode.
    >Just so Peter. These are not trivialities. ...

    OK, I'll accept this one.

    >... Writing of 'foreign' words in the
    >ancient context is not so different to the PtrKrk key ring. ...

    Not "PTRKRK", but "PETER" encoded with a pseudo-alphabet which includes
    vowels, as follows (Gardiner's codes in parentheses):

    reed mat or stool (D15)
    reed (C20)
    bun (D24)
    reed (C20)
    mouth (A38)

    A rather boring set of hieroglyphs, actually. But on the back of the
    keyring I did get a sandal strap (E34) = life and a dung beetle (B67) =
    fortune. I also have a T-shirt with the complete hieroglyphic "alphabet"
    i.e. equivalents for A-Z. These things are widespread in tourist areas
    of Egypt.

    On 28/05/2004 21:40, James Kass wrote:

    >It is respectfully suggested that anyone who is not able to spot
    >the errors on this page...
    > the transliteration (and translation) of the inscription within
    >(let's be gracious here) ten or fifteen seconds, without the aid of
    >an alphabet chart, is not a member of the script's user community.

    Well, I don't claim to be a member of the user community myself. And it
    took me a little bit longer than 10-15 seconds to spot the problem, but
    only about 30 seconds. Basically, one word has been left out of the
    transliteration, and two from the translation. The first line (dropping
    the reconstructions) should read:

    [...]MYT MLK M'B HD[...]

    [...K]emosh-yat, king of Moab, the D[ibonite]

    On 29/05/2004 00:40, James Kass wrote:

    >Peter Kirk wrote,
    >>Well, if anyone has another scenario to propose, let's see it.
    >Chang and Eng debate the merits of the Everson proposal
    >from opposing viewpoints. Eng stabs Chang in a fit of
    >pique, forgetting momentarily that they are joined at
    >the hip. Both die.
    >In this parable, which isn't really responsive to Peter Kirk's
    >request for additional encoding difficulty scenarios, we can
    >see that even Siamese twins have unique identities in spite
    >of, uh, remarkably similar DNA. Even if one were to call
    >such twins "diabrothers", they'd still be individuals.

    A great parable! But how can you distinguish Chang and Eng, who have
    separate personalities but the same DNA and who cannot survive
    separately, from a schizophrenic who has multiple personalities which
    can even argue with one another and kill one another, but also has one
    set of DNA and one life? Counting heads, I suppose, but then (to move to
    fiction) is Zaphod Beeblebrox two people, or Cerberus three dogs? So we
    are back in the territory of close judgment calls, and of trying to find
    a way of formalising a relationship somewhere between identity and
    complete separation.

    Let's change the analogy. Chang and Eng are now the Hebrew and
    Phoenician scripts themselves. Let's accept them as separate
    individuals. But that doesn't mean that we can grab the two of them,
    pull them apart, and put them in entirely separate boxes. That will
    simply kill both of them.

    Peter Kirk

    This archive was generated by hypermail 2.1.5 : Sat May 29 2004 - 12:35:29 CDT