From: Edward H. Trager (firstname.lastname@example.org)
Date: Mon Nov 08 2004 - 14:06:44 CST
So it appears that this thread, which some feared was possibly "off topic",
has resulted in at least two positive outcomes, or at least I think it has.
==> One, we have had some introductions to at least some of the people on
the list. Not a statistical sampling, but very informative nonetheless.
==> Two, I think this thread makes it all too clear that Unicode Consortium
members (and all those on this list who might wish to influence the outcome
of things) need to *thoughtfully consider* the concept not only of
"significant nodes" for scripts in a historical context, but also
--some might see this as far-fetched-- in a future context as well.
Rather than dredge up the Phoenician vs. Hebrew debates, hopefully the
respective experts here can begin to focus more on the "big picture"
of how human scripts --as with all things-- change over time.
What changes in concepts and in the wording of concepts presented in
the latest version of the Unicode Standard should be made to clearly
convey the idea of the continuous evolution of human scripts?
How does the Unicode Standard effectively address the realities of
what one might, for lack of a better term at the moment, refer to as the
"punctuated equilbria" of human script evolution?
How is, or is not, Unicode going to support the scholarly endeavours
of researchers who study ancient scripts and the history of those scripts?
Human languages and human scripts are not static, as we all know from examining
history, and they won't somehow become static in the future just because
of recognized standards like Unicode. If we want Unicode to last a long, long
time as a meaningful standard --for at least hundreds of years, if we cannot
imagine thousands-- then these things need to be considered.
While the Unicode code space is by definition finite, it is
for all practical purposes a very large code space that should be
able to accommodate the "legitimate needs" of scholars, researchers, and historians,
among others. Regardless of whether or not one agrees with the encoding
of Phoenician in Unicode, I --perhaps naively, I admit-- fail to see how it does
any more harm than the encoding of that HUGE number of "CJK Unified Ideographs
Extension B" characters which, as far as I can tell (given my lack of scholarship in this area),
is of more use to esoteric scholars
than it is to ordinary speakers and writers of Chinese, Japanese, or Korean.
It is no worse than the encoding of a large number of Arabic ligatures --a clear
case of encoding glyphs, not characters-- that occurred in Unicode to support legacy
systems that had already been defined for Arabic at the time when Unicode came around.
Thankfully a similar thing did not happen for, say, Syriac. It is no worse than
the encoding of Hangul syllables.
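As an aside, the duplication in those last two cases is visible directly in the
standard's own normalization data: precomposed Hangul syllables decompose canonically
into their conjoining jamo, and the Arabic presentation-form ligatures decompose
under compatibility normalization. A minimal sketch in Python (the particular
characters chosen here are just illustrative examples):

```python
import unicodedata

# U+AC00 HANGUL SYLLABLE GA is canonically equivalent to its two
# conjoining jamo, U+1100 (choseong kiyeok) + U+1161 (jungseong a).
syllable = "\uAC00"
jamo = unicodedata.normalize("NFD", syllable)
assert jamo == "\u1100\u1161"

# U+FEFB ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM is a
# compatibility (glyph-level) encoding of the two letters LAM + ALEF.
ligature = "\uFEFB"
letters = unicodedata.normalize("NFKC", ligature)
assert letters == "\u0644\u0627"

print(unicodedata.name(syllable))   # HANGUL SYLLABLE GA
print(unicodedata.name(ligature))   # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
```

In both cases the "duplicate" characters round-trip to sequences of more basic
characters, which is exactly what marks them as concessions to legacy practice
rather than new semantic distinctions.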
I don't closely follow what additional planes of Unicode are being designated
for, but perhaps there should be a plane set aside for the encoding of historical
"script nodes" that would be useful to scholars, but not as useful to others.
Then again, perhaps I'm too naive in this area to know what I'm talking about ... ;-)
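For what it's worth, the plane of any code point can be read off its value
directly, and Plane 1 (the Supplementary Multilingual Plane) already plays
roughly the role suggested above: historic scripts such as Gothic, Old Italic,
and Deseret were allocated there. A small sketch (the code points shown are
ones already assigned in the standard):

```python
# A code point's plane is simply its value divided by 0x10000,
# i.e. the bits above the low 16.
def plane(cp: int) -> int:
    return cp >> 16

assert plane(0x05E0) == 0   # HEBREW LETTER NUN: Basic Multilingual Plane
assert plane(0x10330) == 1  # GOTHIC LETTER AHSA: Supplementary Multilingual Plane
assert plane(0x20000) == 2  # first CJK Ext B ideograph: Supplementary Ideographic Plane
```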
- Ed Trager
> >>You have indeed stated an intention to encode "significant nodes".
> >Yes. Based on the scholarly taxonomy of writing systems.
> >>But the official documentation, the Unicode Standard, does not say
> >>anything like this.
> >Alarm! Alarm! I detect a desire on your part to consider informative,
> >explanatory text as normative.
> No, my desire is that informative, explanatory text should not give
> misinformation and obfuscation. Note that I was not actually arguing for
> character encodings to follow this informative text, but for the text to
> be changed to match the reality of character encodings and the apparent
> decision of the UTC to encode scripts on the basis of significant nodes
> rather than semantic distinctions. And I was pointing out that many
> people, including an important group of Semitists, who (at least from
> your perspective) have misunderstood the situation are only following
> what they have read in informative, explanatory text in the Standard.
> >>Rather, it states that Unicode encodes "Characters, Not Glyphs", and
> >>that "Characters are the abstract representations of the smallest
> >>components of written language that have SEMANTIC value" (TUS section
> >>2.2 p.15, my emphasis on "SEMANTIC").
> >Yes. ARABIC LETTER SHEEN is a different letter, and a different
> >character from SYRIAC LETTER SHIN. DEVANAGARI LETTER KA is a different
> >letter, and a different character, from ORIYA LETTER KA. PHOENICIAN
> >LETTER NUN is a different letter, and a different character, from
> >HEBREW LETTER NUN.
> The first two, yes by definition, because they are in the Standard. The
> last one, only provisionally because it is subject to an ISO ballot.
> >>And, Michael, I think you have agreed with me, and so with many
> >>scholars of Semitic languages, that the distinction between
> >>corresponding Phoenician and Hebrew letters (like that between
> >>corresponding Devanagari and Gujarati letters) is not a semantic one.
> >LETTERS differ by semantics. SCRIPTS differ by other criteria, WHETHER
> >OR NOT TEXT AFFIRMING THIS HAS BEEN WRITTEN INTO THE UNICODE STANDARD.
> Well, if what the Standard says is different from this, you can hardly
> be surprised that people are confused. One national representative on
> WG2 wrote to me offlist (with a copy to you, Michael, and several
> others) suggesting that I was doing something morally wrong in rejecting
> a semantic distinction between Hebrew and Phoenician. I replied telling
> him that you too consider the distinction not semantic. I haven't heard
> any more from him.
> >>The conclusion we reach from reading the Standard is that these
> >>distinctions are glyph distinctions and so should not be encoded.
> >You're wrong. You ignore the historical node-based distinctions which
> >differentiate the Indic scripts one from the other, and which apply
> >equally to Phoenician and Hebrew. And no, Fraktur and Sütterlin are
> >not the same sort of thing.
> The Standard ignores the historical node-based distinctions. I was
> trying to follow the Standard.
> >>If it is indeed the position of the UTC that corresponding characters
> >>in "significant node" scripts should be encoded despite the lack of
> >>semantic distinctiveness,
> >This is YOUR requisite.
> Is this the position of the UTC? Or does the UTC hold that your
> "significant node" scripts are semantically distinct, although you
> disagree? Or does the UTC not in fact accept your principle that
> "significant node" scripts should be encoded, despite their decision on
> Phoenician? Perhaps this should be clarified first.
> >>I would like to suggest an amendment to the standard to make this
> >>principle clear. This would of course have to be agreed with WG2.
> >>Until such an amendment has been put in place, there will continue to
> >>be opposition to encoding of any new scripts which do not show clear
> >>semantic distinctiveness and so appear to be in breach of the
> >>principles of the Standard.
> >You're mistaken in your application of the concept of "semantic
> >distinctiveness" with regard to script identity.
> Well, I thought we were agreed on at least one thing, that the
> distinction between Phoenician and Hebrew should not be described as
> "semantic distinctiveness". And since, according to an informative part
> of the Standard, p.15, "semantic value" is the only criterion for a
> distinct character, it is hardly surprising that people are confused.
> Peter Kirk
> email@example.com (personal)
> firstname.lastname@example.org (work)
This archive was generated by hypermail 2.1.5 : Mon Nov 08 2004 - 13:39:16 CST