From: Peter Kirk (email@example.com)
Date: Mon Nov 08 2004 - 06:04:02 CST
On 08/11/2004 01:28, Michael Everson wrote:
> At 22:45 +0000 2004-11-07, Peter Kirk wrote:
>> You have indeed stated an intention to encode "significant nodes".
> Yes. Based on the scholarly taxonomy of writing systems.
>> But the official documentation, the Unicode Standard, does not say
>> anything like this.
> Alarm! Alarm! I detect a desire on your part to consider informative,
> explanatory text as normative.
No, my desire is that informative, explanatory text should not give
misinformation and obfuscation. Note that I was not actually arguing for
character encodings to follow this informative text, but for the text to
be changed to match the reality of character encodings and the apparent
decision of the UTC to encode scripts on the basis of significant nodes
rather than semantic distinctions. And I was pointing out that many
people, including an important group of Semitists, who (at least from
your perspective) have misunderstood the situation are only following
what they have read in informative, explanatory text in the Standard.
>> Rather, it states that Unicode encodes "Characters, Not Glyphs", and
>> that "Characters are the abstract representations of the smallest
>> components of written language that have SEMANTIC value" (TUS section
>> 2.2 p.15, my emphasis on "SEMANTIC").
> Yes. ARABIC LETTER SHEEN is a different letter, and a different
> character from SYRIAC LETTER SHIN. DEVANAGARI LETTER KA is a different
> letter, and a different character, from ORIYA LETTER KA. PHOENICIAN
> LETTER NUN is a different letter, and a different character, from
> HEBREW LETTER NUN.
The first two pairs, yes by definition, because they are in the Standard. The
last one, only provisionally, because it is subject to an ISO ballot.
>> And, Michael, I think you have agreed with me, and so with many
>> scholars of Semitic languages, that the distinction between
>> corresponding Phoenician and Hebrew letters (like that between
>> corresponding Devanagari and Gujarati letters) is not a semantic one.
> LETTERS differ by semantics. SCRIPTS differ by other criteria WHETHER
> OR NOT TEXT AFFIRMING THIS HAS BEEN WRITTEN INTO THE UNICODE STANDARD.
Well, if what the Standard says is different from this, you can hardly
be surprised that people are confused. One national representative on
WG2 wrote to me offlist (with a copy to you, Michael, and several
others) suggesting that I was doing something morally wrong in rejecting
a semantic distinction between Hebrew and Phoenician. I replied telling
him that you too consider the distinction not semantic. I haven't heard
any more from him.
>> The conclusion we reach from reading the Standard is that these
>> distinctions are glyph distinctions and so should not be encoded.
> You're wrong. You ignore the historical node-based distinctions which
> differentiate the Indic scripts one from the other, and which apply
> equally to Phoenician and Hebrew. And no, Fraktur and Sütterlin are
> not the same sort of thing.
The Standard ignores the historical node-based distinctions. I was
trying to follow the Standard.
>> If it is indeed the position of the UTC that corresponding characters
>> in "significant node" scripts should be encoded despite the lack of
>> semantic distinctiveness,
> This is YOUR requisite.
Is this the position of the UTC? Or does the UTC hold that your
"significant node" scripts are semantically distinct, although you
disagree? Or does the UTC not in fact accept your principle that
"significant node" scripts should be encoded, despite their decision on
Phoenician? Perhaps this should be clarified first.
>> I would like to suggest an amendment to the standard to make this
>> principle clear. This would of course have to be agreed with WG2.
>> Until such an amendment has been put in place, there will continue to
>> be opposition to encoding of any new scripts which do not show clear
>> semantic distinctiveness and so appear to be in breach of the
>> principles of the Standard.
> You're mistaken in your application of the concept of "semantic
> distinctiveness" with regard to script identity.
Well, I thought we were agreed on at least one thing, that the
distinction between Phoenician and Hebrew should not be described as
"semantic distinctiveness". And since, according to an informative part
of the Standard, p.15, "semantic value" is the only criterion for a
distinct character, it is hardly surprising that people are confused.
--
Peter Kirk
firstname.lastname@example.org (personal)
email@example.com (work)
http://www.qaya.org/