Re: Courtyard Codes and the Private Use Area

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Mon May 27 2002 - 07:07:00 EDT


Response to the comments of Mr Curtis Clark.

> I'm sorry, but I can't tell whether you are being intentionally
> contrarian or simply dense.

Certainly not the former and hopefully not the latter.

Offering readers a choice between only two personality traits seems strange
in a scientific discussion but, as you have raised the matter of
personality, I shall mention the Myers-Briggs Type Indicator.

There is a lot of information about Myers-Briggs available on the web.

A key point about the Myers-Briggs Type Indicator, which is based on Jungian
psychology, is that each person has a personality type, and that this type
can influence how that person views the actions of someone whose way of
looking at the world is very different from his or her own. The
Myers-Briggs Type Indicator is a very important concept for managers to
understand, if only to the extent of recognizing that a variety of
personality types exists and that all of them are quite acceptable. It is
a fascinating subject and I have learned a lot from it: I feel that I
understand people much better now.

> To say that Unicode does not provide
> the basis for <em>markup</em> is the same as saying that Unicode does not
> provide the basis for English or C++.

Well, I feel that it is not the same thing at all. If a document with an
English poem has a U+0045 in it, then it is displayed as letter E every
time, in accordance with the Unicode specification. If a file of markup has
a U+0045 in it, then maybe it is displayed as a letter E in accordance with
the Unicode specification, or maybe it is used to signal something else,
something that is not in the Unicode specification. In markup, a <
character opens a sort of bubble with its own non-Unicode Private Use Area,
and that non-Unicode Private Use Area is placed right on top of the standard
Unicode tables!
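
As a minimal sketch of the distinction drawn above, and assuming Python with
its standard html.parser module (neither appears in this discussion; the
names TextCollector and displayed_text are hypothetical and chosen only for
illustration), the following counts how often U+0045 occurs in a short
piece of markup and how much of it a markup-aware reader would treat as
text to display:

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Separates tag names from the character data meant for display."""

    def __init__(self):
        super().__init__()
        self.tags = []   # names of start tags seen (html.parser lowercases them)
        self.text = []   # runs of character data outside any tag

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def displayed_text(source):
    """Return (text outside tags, list of start-tag names) for `source`."""
    parser = TextCollector()
    parser.feed(source)
    parser.close()  # flush any buffered character data
    return "".join(parser.text), parser.tags


marked = "<EM>E</EM>"

# U+0045 occurs three times in the markup source ...
occurrences = marked.count("\u0045")

# ... but only one of them is treated as text; the others are part of tag names.
shown, tags = displayed_text(marked)
print(occurrences, repr(shown), tags)
```

The same code point is stored in every case; what differs is whether the
software reading the file treats it as text or as part of a tag.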

I have only a little knowledge of C++ so I shall not comment upon that.

> XML is <em>explicitly</em> based on Unicode. And I have not a clue as to
> what you mean by a "non-Unicode file format" in this context.

There, your own evidence! "based on". I disagreed with your claim that
Unicode .... provides the basis for a .... system of markup ... by providing
(list of characters). I did not claim that third parties do not themselves
base some system of their own upon Unicode. This is important. You made
your claim as if it refuted my ideas, yet your claim is not about the same
thing at all. Courtyard Codes, whether in the Private Use Area or promoted
to regular Unicode, would not be in the same category as HTML in relation to
Unicode, for Courtyard Codes would not produce a bubble that affects
the way that subsequent characters are handled by the software.

The clear distinction is that the Unicode specification does not provide the
< character as a means of entering a bubble where following characters mean
something that is not in the Unicode specification. A "non-Unicode file
format" is simply a file where the contents are not all characters having
the meaning specified in the Unicode specification. An HTML file is clearly
in a non-Unicode file format, because it uses the < character to enter a
bubble of its own with its own Private Use Area that overrides the meanings
of the Unicode specification within the bubble!

Response to the comments of Mr Michael Everson.

> Of course it does.

No "of course" about it. Say why if you choose. Unicode does not provide
the basis for markup. The Unicode specification shows how characters are to
be displayed. Markup does not display a < character properly in accordance
with the Unicode specification and, having received a < character, does not
display following characters properly in accordance with the Unicode
specification until a > character, which it also does not display in
accordance with the Unicode specification, is received.
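
The behaviour described in the paragraph above can be sketched as a small
two-state scanner. This is only an illustration in Python under simplifying
assumptions: the function name visible_characters is hypothetical, and the
sketch ignores comments, quoted attribute values and character references,
all of which a real markup processor would have to handle.

```python
def visible_characters(source):
    """Drop everything from U+003C up to and including the next U+003E."""
    inside_tag = False
    shown = []
    for ch in source:
        if ch == "\u003C":                    # LESS-THAN SIGN opens the "bubble"
            inside_tag = True
        elif ch == "\u003E" and inside_tag:   # GREATER-THAN SIGN closes it
            inside_tag = False
        elif not inside_tag:                  # outside the bubble: keep the character
            shown.append(ch)
    return "".join(shown)


print(visible_characters("a <b>bold</b> word"))
```

Under these assumptions the characters between < and > never reach the
output, while a > that arrives without a preceding < is kept as ordinary
text.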

>>Character U+003C is LESS-THAN SIGN
>>Character U+003E is GREATER-THAN SIGN
>>Character U+002F is SOLIDUS

>Yup. And those characters form a widely-used and standardized formal system
>of markup, which is written using these Unicode characters.

Unicode might well be used by some end users for producing markup, but that
is not the same as the original claim that Unicode provides the basis for
markup, a claim that was made as if it justified dismissing my ideas as not
being good.

William Overington

27 May 2002


