From: Peter Kirk (firstname.lastname@example.org)
Date: Mon Nov 08 2004 - 07:54:30 CST
On 08/11/2004 12:47, Michael Everson wrote:
> At 12:04 +0000 2004-11-08, Peter Kirk wrote:
>> No, my desire is that informative, explanatory text should not give
>> misinformation and obfuscation. Note that I was not actually arguing
>> for character encodings to follow this informative text, but for the
>> text to be changed to match the reality of character encodings and
>> the apparent decision of the UTC
> No different decision has been made by the UTC or WG2. I have tried to
> put into words
>> to encode scripts on the basis of significant nodes rather than
>> semantic distinctions.
> SCRIPTS ARE NOT DISTINGUISHED ONE FROM ANOTHER ON THE BASIS OF "SEMANTICS".
Understood. But if this is true, and agreed by the UTC, it should be
expressed clearly in the Standard, if only to avoid the continuing
confusion and the danger of further long discussions like this one.
>> And I was pointing out that many people, including an important group
>> of Semitists, who (at least from your perspective) have misunderstood
>> the situation are only following what they have read in informative,
>> explanatory text in the Standard.
> Peter, we have tried to explain this to your particular group of
> Semiticists, important or not, and none of you have actually heard
> what we have said, because you don't WANT to, I suspect. ...
Perhaps in part, but also because we consider the Unicode Standard to be
more authoritative than statements by you or any other individual. And
there does seem to be some contradiction.
>>> ARABIC LETTER SHEEN is a different letter, and a different character
>>> from SYRIAC LETTER SHIN. DEVANAGARI LETTER KA is a different letter,
>>> and a different character, from ORIYA LETTER KA. PHOENICIAN LETTER
>>> NUN is a different letter, and a different character, from HEBREW
>>> LETTER NUN.
>> The first two, yes by definition, because they are in the Standard.
>> The last one, only provisionally because it is subject to an ISO ballot.
> No, in the real world, these things are different. We encode them
> because they ARE different. They are not different because we encoded
> them.
Well, this is the nub of the issue. You consider that they are different
letters. Therefore you encode them. Fair enough. But there is no
semantic distinction, we agree. And according to the Unicode Standard as
it stands, that implies that they should not be distinct characters. So,
first you should amend the Standard to match reality as you perceive it,
and then encode according to the Standard.
Nevertheless, while PHOENICIAN LETTER NUN may be a different letter, it
is not a Unicode character, different or otherwise, until it has been
accepted in the ISO ballot, and presumably confirmed by the UTC.
>> Is this the position of the UTC? Or does the UTC hold that your
>> "significant node" scripts are semantically distinct, although you
>> disagree? Or does the UTC not in fact accept your principle that
>> "significant node" scripts should be encoded, despite their decision
>> on Phoenician? Perhaps this should be clarified first.
> I, an expert in the world's writing systems, who has worked for a
> decade encoding scripts and managing the roadmap and all, have tried
> to explain that we have always been encoding according to the science
> of the study of the world's writing systems. Whether the text of the
> standard supplies enough text to make you comfortable or not is not a
> priority for me. Perhaps Ken Whistler and I, in our abundant spare
> time, might try to wordsmith the standard with regard to this issue.
> But your insistence that some legalistic interpretation of that text
> will determine what is and what is not a script is tiresome.
You will very likely save yourselves time later, as you might have saved
yourself long arguments before now about Phoenician, if you invest a
little time in amending the standard. I hardly consider it legalistic to
insist that characters are encoded into the Standard according to the
principle laid down in the Standard. A Standard which is internally
inconsistent is of no help to anyone.
-- Peter Kirk email@example.com (personal) firstname.lastname@example.org (work) http://www.qaya.org/