Re: The existing rules for U+FFF9 through to U+FFFC. (spins from Re: Furigana)

From: Peter_Constable@sil.org
Date: Fri Aug 16 2002 - 13:32:16 EDT


On 08/15/2002 06:41:59 AM "William Overington" wrote:

>>In essence, though not formally, U+FFF9..U+FFFC are non-characters as
>>well, and the Unicode "semantics" just tells what programs *may* find
>>them useful for. Unicode 4.0 editors: it might be a good idea to emphasize
>>the close relationship of this small repertoire with the non-characters.
>
>That is not what the specification says.

William, John knows what he is talking about, and is exactly correct: in
essence, though not formally, FFF9..FFFC are non-characters. No, the
Standard doesn't say that; that's why he said, "not formally". The use
intended by the Standard is, however, exactly comparable to the
non-characters at FDD0..FDEF. If they had been defined in the Standard as
non-characters, the world would not be different in any meaningful way.

>It appears to me that the use of the annotation characters in document
>interchange is never forbidden and is strongly discouraged only where there
>is no prior agreement between the sender and the receiver, and that that
>strong discouragement is because the content may be misinterpreted
>otherwise. So, if there is a prior agreement, then there is no problem
>about using them in interchanged documents.
>
>There appears to be nothing that suggests that U+FFFC cannot be used in an
>interchanged document.

Well, you've missed the intent of the authors of the Standard, and appear
not to grasp the mindset. When it says interchange of IA characters may be
OK given prior agreement, what's really in mind is that e.g. I've written
code library A that handles some aspects of interlinear annotation, you've
written code library B that handles different aspects of interlinear
annotation, and we agree on certain interfaces so that my library can call
yours or vice versa, and agree that strings passed by those interfaces can
contain IA characters. That's the kind of thing that's in mind. It does
*not* imply that anyone should consider creating a document containing IA
characters.
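
As a rough illustration of that boundary (a sketch only, not something the
Standard specifies): a library might accept IA characters on its internal
interfaces and strip them before anything leaves as a document. The function
name and the decision to keep only the annotated (base) text are assumptions
made for this example, and nested annotations are not handled.

```python
import re

# U+FFF9 anchor, U+FFFA separator, U+FFFB terminator.
# One non-nested annotation: anchor, base text, one or more
# separator-plus-annotating-text runs, terminator.
_ANNOTATION = re.compile(
    "\uFFF9([^\uFFF9-\uFFFB]*)(?:\uFFFA[^\uFFF9-\uFFFB]*)+\uFFFB"
)
# Any IA character left over after well-formed annotations are removed.
_STRAY = re.compile("[\uFFF9-\uFFFB]")


def strip_annotations(text: str) -> str:
    """Hypothetical export filter: keep base text, drop annotating text."""
    without_annotations = _ANNOTATION.sub(r"\1", text)
    return _STRAY.sub("", without_annotations)
```

Calling this at the point where a string is about to be written out keeps
the IA characters confined to the agreed interfaces.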

>I know little about Bliss symbols, though I have seen a few of them and have
>read a brief introduction to them, yet it seems to me that annotating Bliss
>symbols with English or Swedish is entirely within the specification
>absolutely and would be no more than strongly discouraged even if there is
>no prior agreement between the sender and the receiver.

Of course the Standard doesn't discourage anyone from annotating Bliss
symbols with English or Swedish; it only discourages the use of IA
characters as markup in documents.

>Further, it seems to me from the published rules that these annotation
>characters could possibly be used to provide a footnote annotation facility
>within a plain text file

That would not be a proposal worth pursuing; in fact, I'd say it's a very
bad idea. The reason you DO NOT want to use IA characters in a document is
that you do not know what someone's software will do with them. The
characters have always been intended for use by software programmers, not
by content authors. (Ditto for the object replacement character.)
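
For anyone who wants to check content before sending it out, a minimal
sketch of such a check follows. The function name and the report format are
invented for this example; what to do with any hits is left to the caller.

```python
# U+FFF9..U+FFFB are the interlinear annotation characters;
# U+FFFC is the object replacement character.
INTERNAL_USE = {"\uFFF9", "\uFFFA", "\uFFFB", "\uFFFC"}


def find_internal_use_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each FFF9..FFFC character in text."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ch in INTERNAL_USE
    ]
```

An empty result means none of the four characters appear in the text.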

>An interesting point for consideration is as to whether the following
>sequence is permitted in interchanged documents...

>It seems to me that if that is indeed permissible that it could potentially
>be a useful facility.

On the whole, it would be very unwise to use these characters in documents
for reasons I explained above. If two people agree to do this, nobody's
going to send the Unicode police to stop them. But very few of us on this
list are particularly interested in what is hypothetically possible for
some pair of us to do. We're far more interested in how widely-used
implementations should and do work, and in such implementations,
FFF9..FFFC are assumed not to be used in content.

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


