From: John H. Jenkins (email@example.com)
Date: Tue Jul 01 2008 - 10:48:16 CDT
I know I'm going to regret wading into this discussion, but...
Unicode's criterion for "plain text" is minimum legibility, the
minimum amount of information necessary to make text intelligible to
the reader. This is explicitly *NOT* the same thing as preserving the
entire semantic content of any specific text.
Now, this is a fuzzy enough line as it is, but it's the reason why
super-/subscripts and italics are omitted. It does mean that some
existing texts cannot be encoded in plain text. It means that plain
text will have to resort to 10^2's of *kludges* to create content
equivalent to what users typically expect to be able to do even with
minimal word processors. It means that you cannot set the Mouse's
Tale from _Alice in Wonderland_ in plain text.
Unicode is intended to be used in an environment where rich text
processing is available to handle the more general problem of
faithfully reproducing any specific text. It is not intended to be a
solution to that problem in and of itself.
I can pretty much guarantee that any proposal to add a "complete" set
of super-/subscripts will get shot down in UTC for three reasons:
1) No human language requires them for minimum legibility.
2) The problem of which characters get to be cloned as
super-/subscripts is a rather nasty one. All accented forms? Just base
letters? Which base letters? All distinct Latin letters? IPA?
Greek? Cyrillic? Why not Hebrew?
3) If the intent is to provide a general solution to super-/subscripts
in actual use, you have to allow for supersubscripts, subsuperscripts,
supersuperscripts, subsubscripts, and so on.
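
As a rough illustration of point 2, the sketch below lists which base characters already have an encoded superscript or subscript clone, using the compatibility decomposition tags (<super>, <sub>) exposed by Python's standard unicodedata module. The helper name script_variants is hypothetical, and the exact coverage it reports depends on the Unicode version bundled with the Python build, so treat the output as indicative only.

```python
import unicodedata

def script_variants(kind):
    """Return {base_char: variant_char} for every code point whose
    compatibility decomposition is tagged <super> or <sub> and maps
    to exactly one base character. `kind` is "super" or "sub"."""
    tag = "<" + kind + ">"
    found = {}
    for cp in range(0x110000):
        ch = chr(cp)
        parts = unicodedata.decomposition(ch).split()
        # A match looks like "<super> 0032": the tag plus one hex code point.
        if len(parts) == 2 and parts[0] == tag:
            base = chr(int(parts[1], 16))
            found.setdefault(base, ch)  # keep the lowest code point per base
    return found

supers = script_variants("super")
subs = script_variants("sub")

# Which lowercase ASCII letters lack an encoded superscript or subscript?
letters = "abcdefghijklmnopqrstuvwxyz"
print("no superscript:", [c for c in letters if c not in supers])
print("no subscript:  ", [c for c in letters if c not in subs])

# Compatibility normalization folds the clones back to plain characters,
# which is one sign that they are treated as presentation variants.
print(unicodedata.normalize("NFKC", "x\u00b2 + H\u2082O"))
```

The coverage is patchy rather than systematic, and NFKC normalization discards the distinction entirely, which is consistent with the view that general super-/subscripting belongs to rich text.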
John H. Jenkins