From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Aug 05 2003 - 19:59:06 EDT
Ted Hopp asked:
> I believe that reasonable people might reasonably conclude from factoids 1
> and 2 that SPACE is indeed a format character.
>
> Reasonable, but evidently wrong. Explanation, please?
I provided the text deconstruction in my last email, but to
continue, the confusion arises from the strange nature of
SPACE in the history of character encoding.
SPACE has long been classified as a *graphic* character.
Certainly, in the general SC2 character encoding context of
ISO 2022, SPACE always shows up in the G0 set, with the other
graphic characters, rather than among the control functions
encoded in the C0 or C1 sets.
But looked at from the legacy of device control, SPACE
could just as well have been categorized as a control function:
MOVE PRINT HEAD ONE UNIT RIGHT, comparable to BACKSPACE.
And in the context of the Unicode Standard, people often
loosely talk about space characters as being format
characters, since they a) are more akin to punctuation than
to normal letters, b) have no glyph associated with them,
and c) impact line-breaking and other aspects of the formatting
of characters in their vicinity.
But the *formal* categorization of Unicode characters,
defined by the UTC to help eliminate this kind of
ambiguity in talk about the character types, is spelled
out in Figure 2.5 of Unicode 4.0 now:
http://www.unicode.org/book/preview/ch02.pdf
and the *formal* meaning of "format control character"
(Basic type = "Format") in Unicode is now any character
with the General Category of {Cf, Zl, Zp}.
The space characters are all lumped in with graphic characters.
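To make the distinction concrete, here is a minimal sketch in Python
of a classifier keyed on General Category. Only the Format set
{Cf, Zl, Zp} is taken from the statement above; the function name
basic_type and the handling of the other categories are assumptions
made for illustration, and Python's unicodedata module reflects
whatever Unicode version the interpreter ships with, not 4.0
specifically.

```python
import unicodedata

# Stated above: Basic type "Format" = General Category in {Cf, Zl, Zp}.
FORMAT_CATEGORIES = {"Cf", "Zl", "Zp"}

def basic_type(ch):
    """Return a rough basic type for one character (illustrative only)."""
    gc = unicodedata.category(ch)
    if gc in FORMAT_CATEGORIES:
        return "Format"
    if gc == "Cc":
        return "Control"
    # Assumption: letters, marks, numbers, punctuation, symbols and
    # space separators (Zs) are all treated as graphic characters.
    if gc[0] in "LMNPS" or gc == "Zs":
        return "Graphic"
    return "Other"  # Cs, Co, Cn are not distinguished in this sketch

for ch in ("\u0020", "\u00a0", "\u2028", "\u2029", "\u200d", "\u0008"):
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {basic_type(ch)}")
```

Run as written, this reports U+0020 SPACE as Zs and Graphic, the
line and paragraph separators and ZERO WIDTH JOINER as Format, and
BACKSPACE as Control.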
So while there are still some ambiguities to be worked out
in the definition of "base character" in the Unicode Standard,
neither the status of SPACE as a graphic character nor the
recommendation of the standard that non-spacing marks be
applied to SPACE as a means of showing them in isolation
is in question.
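As a small illustration of that recommendation, the following Python
sketch builds the sequence <SPACE, non-spacing mark>. The helper name
show_in_isolation and the choice of U+0301 COMBINING ACUTE ACCENT are
assumptions made for the example, not anything prescribed by the
standard.

```python
import unicodedata

SPACE = "\u0020"
COMBINING_ACUTE = "\u0301"  # example mark only; any Mn character works

def show_in_isolation(mark):
    """Return SPACE followed by a non-spacing mark (hypothetical helper)."""
    if unicodedata.category(mark) != "Mn":
        raise ValueError("expected a non-spacing mark (General Category Mn)")
    return SPACE + mark

seq = show_in_isolation(COMBINING_ACUTE)
print([f"U+{ord(c):04X}" for c in seq])  # ['U+0020', 'U+0301']
```

The result is still two code points, with SPACE serving as the
graphic character that carries the mark.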
--Ken