I don't think this violates the idea of "plain text". Plain text is an
interchange concept, while we are talking about input methods. Once you
wish to allow plain text to include more than the small number of
characters that can be conveniently provided by the keyboard you have to
provide more sophisticated input methods.
Quotation marks are just one case: Unicode contains many more characters
that an author may wish to use but that are not on his keyboard. Hexadecimal
input is not a solution for the general public.
I suggest we take a look at how things used to be done before computers. In
those ancient times, one would give a printer a manuscript (= handwritten
paper), which was marked up either by the author or by an editor, and the
printer would set the text in print. This was the grandfather of mark-up
languages, later standardized in SGML.
In those manuscripts, the text could not precisely indicate various
typographic distinctions, such as quotation marks, and in those cases
markup was used.
It is much more user friendly to have to write <Q>text</Q>, or to select
the text and click on a "quotation" menu item, indicating intent, rather
than <&lsqm>text<&rsqm> or something similar, or some fancy keyboard
combination, in which the author has to specify the precise implications of
How will mathematical symbols be entered in plain text?
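
As a rough illustration of the <Q>text</Q> idea above, here is a minimal
Python sketch of a processor that substitutes typographic quotation marks
for the markup. The QUOTES table, the render_quotes name and the language
codes are assumptions made for this example only, and nested quotations are
not handled.

```python
import re

# Hypothetical table of (opening, closing) marks per language.
# This is an assumption for illustration, not a complete or standard list.
QUOTES = {
    "en": ("\u201C", "\u201D"),  # LEFT / RIGHT DOUBLE QUOTATION MARK
    "de": ("\u201E", "\u201C"),  # DOUBLE LOW-9 / LEFT DOUBLE QUOTATION MARK
}

# Matches a non-nested <Q>...</Q> element, case-insensitively.
Q_ELEMENT = re.compile(r"<Q>(.*?)</Q>", re.IGNORECASE | re.DOTALL)


def render_quotes(text, lang="en"):
    """Replace each <Q>...</Q> element with the quoted text wrapped in the
    quotation marks configured for `lang`."""
    opening, closing = QUOTES[lang]
    return Q_ELEMENT.sub(lambda m: opening + m.group(1) + closing, text)


if __name__ == "__main__":
    print(render_quotes("He said <Q>hello</Q>."))
    print(render_quotes("Er sagte <Q>hallo</Q>.", lang="de"))
```

The author marks only the intent (a quotation); which characters end up in
the output is decided by the processor.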
At 02:23 19/07/99 -0700, Markus Kuhn wrote:
>Jonathan Rosenne wrote on 1999-07-18 22:14 UTC:
>> 1. this is one of the reasons for <Q>text</Q> in HTML. The processor can
>> substitute the correct character.
>> In general, any word processor should allow the user to style the text as a
>> quotation, rather than require him to type typographical characters.
>I personally am not convinced that higher layer protocols should be used
>to handle punctuation. This completely violates my concept of plain
>text, and the existing practice of using higher layer protocols here
>clearly just derives from the limitations of ASCII, an artifact of an
>era that we are hopefully about to leave behind us. Higher layer
>protocols such as SGML are fine for things like font selection and other
>formatting and logical structuring, but quotation marks and other
>punctuation are too much a part of the raw text for me to want to
>see them handled via hacks such as <Q>. Higher layer protocols should in
>my opinion not represent the actual textual content of the text, but
>give only auxiliary structuring and representation hints. Therefore I
>don't like to see markup for quotation marks, just as I don't like the
>idea of having to mark up conditional clauses, sentences, and perhaps even
>paragraphs (not sure about the last one though).
>> 2. The situation for Hyphen-Minus is quite similar.
>Agreed, it is equally confusing and keyboard entry conventions should be
>carefully standardized here as well.
>Mark Davis wrote on 1999-07-18 17:47 UTC:
>> There seems to be some misunderstanding. "The Unicode Standard®, Version
>> 2.1" gives the following text (see
>> http://www.unicode.org/unicode/reports/tr8.html#3.6 Apostrophe Semantics):
>> U+02BC MODIFIER LETTER APOSTROPHE is preferred where the character
>> is to represent a modifier letter (for example, in transliterations
>> to indicate a glottal stop.) In the latter case, it is also referred
>> to as a letter apostrophe.
>> U+2019 RIGHT SINGLE QUOTATION MARK is preferred where the character is to
>> represent a punctuation mark, as in "We've been here before." In the
>> latter case, U+2019 is also referred to as a punctuation apostrophe.
>Excellent! I missed that 2.1 correction, and I am delighted to see that
>this was already fixed nicely. So U+02BC is one thing less to worry
>about and the Microsoft Word practice actually does conform to the
>standard. Thanks for the reply.
>So the rest is really up to the keyboard standards community to fix.
>Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
>Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
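
The distinction between the two apostrophe code points quoted above can be
inspected with Python's standard unicodedata module. This is only a small
sketch; the describe helper is a name made up for this example, and U+0027
is included purely for comparison.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


if __name__ == "__main__":
    # U+02BC: letter apostrophe, U+2019: punctuation apostrophe,
    # U+0027: the ASCII apostrophe found on most keyboards.
    for ch in ("\u02BC", "\u2019", "\u0027"):
        print(*describe(ch))
```

U+02BC is reported with general category Lm (modifier letter) and U+2019
with Pf (final quote punctuation), which matches the letter versus
punctuation distinction drawn in the quoted text.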