Re: PH technical issues (was RE: Why Fraktur is irrelevant

From: Peter Kirk (peterkirk@qaya.org)
Date: Sat May 29 2004 - 06:03:23 CDT

Next message: Dean Snyder: "Re: PH technical issues (was RE: Why Fraktur is irrelevant"

Previous message: John D. Burger: "Re: base16k - Efficient Binary Data Encoding in Unicode Text"
In reply to: Kenneth Whistler: "RE: PH technical issues (was RE: Why Fraktur is irrelevant"
Next in thread: Dean Snyder: "Re: PH technical issues (was RE: Why Fraktur is irrelevant"

On 28/05/2004 17:41, Kenneth Whistler wrote:

> ...
>
>This also struck me as a major misunderstanding in Peter Kirk's
>note, which may underlie some of the problem this thread has
>been having in coming to *any* conclusions whatsoever.
>
>Take a look at page 343 of the Unicode Standard, which shows a
>line from the Codex Argenteus in Gothic script. That line is
>then *transliterated* into the Latin script, and a translation
>is also given. Taking just the last word, we have the
>Gothic:
>
>

(snipped to save bandwidth)

Thank you, Ken. I understand better what is going on now. I accept your
argument that Gothic is roughly parallel to Phoenician in this regard,
although it does not have such a clear one-to-one correspondence with
any other script (and the closest other script is Greek, not Latin). Of
course this does not entirely imply that Unicode should treat Phoenician
in the same way as Gothic, because the separate treatment of Gothic may
have been a mistake - although that is not my argument now.

But the Sally and Latisha scenario still begs the question: there is a
problem of preservation of distinct character semantics only for those
who presuppose that there is a semantic distinction in the first place,
and not only a glyphic one. I am now more or less convinced that
there IS a semantic distinction in this sense, and so I am supporting
separate encoding. What I am NOT convinced of is that that semantic
distinction is significant enough to require a completely separate
script. It seems to me much more like the kind of distinction at the
glyph level only for which variation sequences were introduced.

Now I see also good reasons for not encoding plain text Phoenician
entirely with variation sequences. (Although length of the text is not a
good reason: the length is unchanged in UTF-16 and increased by only 25%
in UTF-8.) But we have discussed alternatives such as interleaved
collation in the default collation table. Does anyone on this list have
any strong objections to this?
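
The length figures above can be checked with a minimal Python sketch. It rests on two assumptions that are not stated in the message: that a separately encoded Phoenician letter would live in a supplementary plane (U+10900 is used here as a stand-in), and that the variation-sequence alternative would be a Hebrew base letter followed by VARIATION SELECTOR-1 (U+FE00). No such variation sequence is defined; it is purely hypothetical.

```python
# Hypothetical comparison: one supplementary-plane letter versus a Hebrew
# base letter followed by a variation selector. Both code point choices
# are assumptions for illustration, not taken from the thread.
SEPARATE = "\U00010900"     # assumed supplementary-plane Phoenician letter
VARIATION = "\u05D0\uFE00"  # HEBREW LETTER ALEF + VARIATION SELECTOR-1


def encoded_lengths(text):
    """Return the byte length of text in UTF-16 and in UTF-8."""
    return {
        "utf-16": len(text.encode("utf-16-le")),  # -le avoids counting a BOM
        "utf-8": len(text.encode("utf-8")),
    }


separate = encoded_lengths(SEPARATE)
variation = encoded_lengths(VARIATION)

# Relative growth in UTF-8 when using the variation sequence instead.
utf8_growth = (variation["utf-8"] - separate["utf-8"]) / separate["utf-8"]

print(separate, variation, f"UTF-8 growth: {utf8_growth:.0%}")
```

Under these assumptions both forms take 4 bytes per letter in UTF-16, while UTF-8 goes from 4 bytes to 5, which is where the 25% figure comes from.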

...

>And no matter how many times Peter Kirk begs the question of
>what is a script distinction, what it comes down to in
>the Unicode Standard is that a script distinction is a
>distinct encoding of a script, neither more nor less.
>It does not correlate directly to a graphologist's or
>palaeographer's definition (if they have one) of what
>a script is, nor can it be defined, a priori, axiomatically. ...
>
>

Nor for that matter does it correlate directly to the pedagogue's
definition, or to the ancient Israelite's definition, although both of
these have also been appealed to in this discussion. But this implies
that it is meaningless so far to say whether Phoenician is a separate
script or not in the Unicode sense, because this is purely a matter of
definition by the UTC and so far the UTC has not made any such
definition. Of course some could argue that it is not a separate script
because it is not in the standard, but others could argue that it is
because it is in the roadmap. But such discussions are pointless; the
real question is, *should* it be defined as a separate script, in the
Unicode sense, or not?

On 28/05/2004 14:58, Peter Constable wrote:

> ...
>
>>Well, if anyone has another scenario to propose, let's see it.
>>
>>
>
>Fine.
>
>Scenario (undesirable):
>
>The editor of a UCLA journal on ancient Indo-European linguistics
>receives submissions from numerous sources for publication in the
>journal. ...
>
>
>Alternate scenario (desirable):
>
>The editor receives submissions as described above. Because Phoenician
>script and Hebrew script are encoded distinctly, there is never any
>concern as to how text provided to reviewers will appear. She saves many
>hours of work both in preparing submissions for reviewers and in final
>typesetting. Embarrassing errors and the need to publish corrigenda are
>significantly reduced.
>
>
>Now tell me that's an unrealistic or trivial scenario.
>
>
>
No, that is a realistic scenario, perhaps because it came from a real
editor of this kind. Well, she might have saved us many hours of work, and
embarrassing errors on every side, if she had presented this scenario a
month or so ago. I agree that in this scenario a plain text distinction
between Phoenician and Hebrew is desirable. I would be concerned about
how many additional plain text distinctions could be justified by this
means, e.g. between different types of Old Italic and Runes as D.
Starner mentions, or even between Fraktur and Antiqua, since someone may have
very deliberately submitted a text (e.g. an old German one) in Fraktur and
would consider it an error for the paper to be printed with Antiqua glyphs.
For that matter, in many submissions markup such as italic (for
emphasis, quoted words etc) is significant and must be preserved, and
this implies that the editor cannot work with plain text only.
Nevertheless, I agree that the editor's task will be simplified by a
plain text distinction between Phoenician and Hebrew, and that this
scenario is not trivial.

>...
>
>I suspect few Semitic paleographers are using MS database products. ...
>
>

And is MS happy about this situation? ;-)

On 28/05/2004 15:26, Peter Constable wrote:

>>From: Peter Kirk [mailto:peterkirk@qaya.org]
>>
>>
>
>
>
>>But I was thinking in
>>terms of tailored collation weights for the Unicode collation
>>algorithm.
>
>
>And moreover (adding to comments in my previous message), it seems
>*very* likely to me that, on those occasions when the Semitic
>paleographer is going to need to fold characters, they're going to deal
>with it not by UCA tailoring but by converting the Phoenician characters
>(as they would more often with Latin characters) to Hebrew characters.
>
>
>
Well, this does not deal with the scenario which I had in mind, and
clearly presented some time ago, in which users are searching the
Internet, or some private but extensive collection of texts, for a
particular word or phrase, in Hebrew or for that matter Moabite etc or
even Phoenician. Currently such a search would need to match Hebrew
characters and also a variety of Latin transliterations. (Hopefully over
time the use of Latin transliterations will fade, or at least become
more standardised as transliterators can use real Unicode characters
with diacritics and not ad hoc ASCII-based solutions.) But if Phoenician
is separately encoded, and at least some palaeo-Hebrew, Moabite etc
texts are represented with the Phoenician characters, searchers will
need to search for an additional encoding. For that matter, searchers
for texts written with Phoenician glyphs will also be inconvenienced
because some such texts will be represented by Hebrew characters. In
such a case the user cannot convert all texts to Hebrew characters in
advance; the folding must be applied by the search engine.

Is this a realistic scenario? Is it one which really requires folding
together of Hebrew and Phoenician? What does anyone else think?
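
The kind of search-time folding described above might look like the following minimal Python sketch. It is an illustration under stated assumptions, not a description of any existing search engine: the Phoenician letters are assumed to occupy 22 consecutive supplementary-plane code points starting at U+10900, in the same alphabetical order as the Hebrew letters, and the names `fold` and `matches` are hypothetical. Hebrew final forms are also folded so that both spellings of a word compare equal.

```python
# Sketch of folding Phoenician and Hebrew together at search time.
# ASSUMPTION: Phoenician letters sit at U+10900..U+10915, one per Hebrew
# letter, in alphabetical order. This layout is not given in the thread.
PHOENICIAN_FIRST = 0x10900

# Hebrew final forms mapped to their ordinary (non-final) forms.
FINAL_TO_PLAIN = {
    0x05DA: 0x05DB,  # final kaf   -> kaf
    0x05DD: 0x05DE,  # final mem   -> mem
    0x05DF: 0x05E0,  # final nun   -> nun
    0x05E3: 0x05E4,  # final pe    -> pe
    0x05E5: 0x05E6,  # final tsadi -> tsadi
}

# The 22 non-final Hebrew letters, alef (U+05D0) through tav (U+05EA).
HEBREW_PLAIN = [cp for cp in range(0x05D0, 0x05EB) if cp not in FINAL_TO_PLAIN]

# Translation table: Phoenician letter -> Hebrew letter, final -> plain.
FOLD_TABLE = {PHOENICIAN_FIRST + i: cp for i, cp in enumerate(HEBREW_PLAIN)}
FOLD_TABLE.update(FINAL_TO_PLAIN)


def fold(text):
    """Map assumed Phoenician letters and Hebrew finals to plain Hebrew."""
    return text.translate(FOLD_TABLE)


def matches(query, document):
    """Substring search that ignores the Phoenician/Hebrew distinction."""
    return fold(query) in fold(document)


# MLK ("king") in the assumed Phoenician encoding and in Hebrew characters.
phoenician_mlk = "\U0001090C\U0001090B\U0001090A"
hebrew_text = "\u05DE\u05DC\u05DA \u05DE\u05D0\u05D1"  # MLK M'B

print(matches(phoenician_mlk, hebrew_text))
```

The point of the sketch is only that the folding lives in the search step: the stored texts keep whichever encoding their authors chose, and a query in either encoding finds both.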

On 28/05/2004 19:06, saqqara wrote:

>----- Original Message -----
>From: "Peter Constable" Sent: Friday, May 28, 2004 10:58 PM
>
>
>
>>>Is it really in the scope of Unicode to encode such trivialities? I have
>>>a key ring with my name "written" in an Egyptian hieroglyphic
>>>pseudo-alphabet. Will such abuse of Egyptian hieroglyphs have to be
>>>taken into account in the possible Unicode proposal for this script?
>>>
>>>
>>Why is that an abuse of hieroglyphs any more than Hebrew text
>>transliterated or transcribed in Latin characters, or Arabic text
>>transcribed in Hangul characters? Unicode is uninterested in what the
>>content of the text is; it encodes characters, not text. It is up to
>>users and implementers to decide what texts those characters can
>>represent.
>>
>>So, absolutely, it is in the scope of Unicode.
>>
>>
>>
>
>Just so Peter. These are not trivialities. ...
>

OK, I'll accept this one.

>... Writing of 'foreign' words in the
>ancient context is not so different to the PtrKrk key ring. ...
>

Not "PTRKRK", but "PETER" encoded with a pseudo-alphabet which includes
vowels, as follows (Gardiner's codes in parentheses):

reed mat or stool (D15)
reed (C20)
bun (D24)
reed (C20)
mouth (A38)

A rather boring set of hieroglyphs, actually. But on the back of the
keyring I did get a sandal strap (E34) = life and a dung beetle (B67) =
fortune. I also have a T-shirt with the complete hieroglyphic "alphabet"
i.e. equivalents for A-Z. These things are widespread in tourist areas
of Egypt.

On 28/05/2004 21:40, James Kass wrote:

>It is respectfully suggested that anyone who is not able to spot
>the errors on this page...
>http://www.kchanson.com/ANCDOCS/westsem/elkerak.html
>...in the transliteration (and translation) of the inscription within
>(let's be gracious here) ten or fifteen seconds, without the aid of
>an alphabet chart, is not a member of the script's user community.
>
>

Well, I don't claim to be a member of the user community myself. And it
took me a little bit longer than 10-15 seconds to spot the problem, but
only about 30 seconds. Basically, one word has been left out of the
transliteration, and two from the translation. The first line (dropping
the reconstructions) should read:

[...]MŠYT MLK M'B HD[...]

[...K]emosh-yat, king of Moab, the D[ibonite]

On 29/05/2004 00:40, James Kass wrote:

>Peter Kirk wrote,
>
>
>
>>Well, if anyone has another scenario to propose, let's see it.
>>
>>
>
>Chang and Eng debate the merits of the Everson proposal
>from opposing viewpoints. Eng stabs Chang in a fit of
>pique, forgetting momentarily that they are joined at
>the hip. Both die.
>
>In this parable, which isn't really responsive to Peter Kirk's
>request for additional encoding difficulty scenarios, we can
>see that even Siamese twins have unique identities in spite
>of, uh, remarkably similar DNA. Even if one were to call
>such twins "diabrothers", they'd still be individuals.
>
>
>
>
A great parable! But how can you distinguish Chang and Eng, who have
separate personalities but the same DNA and who cannot survive
separately, from a person with multiple personalities which
can even argue with one another and kill one another, but who also has one
set of DNA and one life? Counting heads, I suppose, but then (to move to
fiction) is Zaphod Beeblebrox two people, or Cerberus three dogs? So we
are back in the territory of close judgment calls, and of trying to find
a way of formalising a relationship somewhere between identity and
complete separation.

Let's change the analogy. Chang and Eng are now the Hebrew and
Phoenician scripts themselves. Let's accept them as separate
individuals. But that doesn't mean that we can grab the two of them,
pull them apart, and put them in entirely separate boxes. That will
simply kill both of them.

-- 
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/


This archive was generated by hypermail 2.1.5 : Sat May 29 2004 - 12:35:29 CDT