Apostrophes, quotation marks, keyboards and typography

From: Markus Kuhn (Markus.Kuhn@cl.cam.ac.uk)
Date: Sun Jul 18 1999 - 11:17:24 EDT


Something that has bothered me for some time:

Unicode features three very important characters that can also very
easily be confused:

  U+0027 APOSTROPHE
  U+02BC MODIFIER LETTER APOSTROPHE
  U+2019 RIGHT SINGLE QUOTATION MARK

Unicode 2.1 suggests that U+0027 be visually clearly distinct
(vertically symmetrical, direction neutral) from U+02BC and U+2019, and
it also declares U+0027 to be less preferable than the other two forms.
Current keyboards have only one single key for both apostrophe and
quotation mark, which is usually associated with U+0027. This follows
old typewriter practice, but is typographically completely outdated.
Software such as Microsoft's Word tries to automatically replace U+0027
with U+2018 and U+2019 on entry. This works sometimes and fails
sometimes, and I see in books and newspaper articles more and more often
two different types of apostrophes intermixed within the same text. It
also seems that while Unicode declares U+02BC to be the recommended
character that should be used as an apostrophe in words such as "isn't",
Microsoft has decided to unify U+02BC and U+2019 and provide only one
single code for both functions in CP1252 at position 0x92. In addition,
European keyboard users who have a separate key for acute and grave
accent also use these two keys frequently to misrepresent both quotation
marks and apostrophes, which adds further to the confusion. Older ASCII
versions even encouraged using the grave accent as a left quotation mark
and the apostrophe as a right quotation mark, which looks nice with some
fonts and horrible with others (especially those following the
standards).
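
As a rough illustration of how the three characters differ in the
character database while sharing one slot in CP1252, here is a minimal
Python sketch. The helper names describe and in_cp1252 are made up for
this example, and Python's unicodedata module reflects a later Unicode
version than 2.1, so the properties shown are those of that later
version.

```python
import unicodedata

# The three easily confused characters discussed above.
CANDIDATES = ["\u0027", "\u02BC", "\u2019"]

def describe(ch):
    """Return (code point, character name, general category) for ch."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

def in_cp1252(ch):
    """Report whether ch can be encoded in CP1252 at all."""
    try:
        ch.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False

for ch in CANDIDATES:
    print(*describe(ch), "in CP1252:", in_cp1252(ch))

# Byte 0x92 in CP1252 decodes to a single Unicode character.
cp1252_0x92 = bytes([0x92]).decode("cp1252")
print("CP1252 0x92 ->", describe(cp1252_0x92))
```

In this database U+02BC is classified as a letter (category Lm), while
U+0027 and U+2019 are punctuation, and only U+0027 and U+2019 have a
CP1252 encoding.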

Somehow, I feel the entire situation has become rather confusing and
leaves something to be desired.

Remarks and suggestions:

First, I must admit that I agree with Microsoft: I see
little reason not to unify

  U+02BC MODIFIER LETTER APOSTROPHE
  U+2019 RIGHT SINGLE QUOTATION MARK

since the two characters, although semantically distinct, are
graphically indistinguishable in practically all fonts. Keyboard typists
can hardly be expected to select the right character, and automatic
smart-quote algorithms cannot be expected to get this distinction
right reliably either. Couldn't Unicode follow Microsoft and simply drop
the recommendation that U+02BC be the preferred apostrophe character,
and instead give U+2019 the dual meaning that it de facto already has today?
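
In processing terms, the unification asked for here amounts to treating
the two code points as equivalent. The following Python sketch shows
one way software could do that today by folding U+02BC into U+2019
before comparing text. The name fold_apostrophes is hypothetical, and
this is only an illustration of the idea, not something Unicode or any
vendor specifies.

```python
# Hypothetical folding table: treat MODIFIER LETTER APOSTROPHE as
# RIGHT SINGLE QUOTATION MARK (the unification suggested above).
FOLD = {0x02BC: 0x2019}

def fold_apostrophes(text):
    """Replace every U+02BC in text with U+2019."""
    return text.translate(FOLD)

with_modifier = "isn\u02BCt"   # apostrophe encoded as U+02BC
with_quote = "isn\u2019t"      # apostrophe encoded as U+2019

# The two spellings look alike in most fonts but compare unequal ...
print(with_modifier == with_quote)
# ... until both are folded to the same code point.
print(fold_apostrophes(with_modifier) == fold_apostrophes(with_quote))

# The semantic difference is still visible to software: U+02BC counts
# as a letter, U+2019 as punctuation.
print(with_modifier.isalpha(), with_quote.isalpha())
```

The last line shows the price of unification: with U+2019 the word is
no longer a pure sequence of letters, which matters to anything that
splits text into words.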

In addition, I feel that the current ISO 8859-oriented national keyboard
standards are not adequate for modern Unicode-era word processing
practices, as they put obsolete typewriter characters such as U+0027 on
too prominent keys, while they have no key positions for the extremely
frequently needed typesetting characters that are, for instance, supported
by CP1252 (directional single and double quotes, en and em dashes,
etc.). Software either has to use shaky algorithms to make educated
guesses about which character the user might have meant (as Word tries
to do), or sequences of ASCII characters are interpreted with new
semantics (as both TeX and Word do), in order to give typists some
compromise access to these characters.
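
The two workarounds can be sketched in a few lines of Python. The first
function reinterprets ASCII sequences in the style of TeX's input
conventions, and the second guesses quote direction from the preceding
character. Both are deliberately naive illustrations with made-up
names; they do not reproduce what TeX or Word actually implement.

```python
import re

# ASCII sequences given new semantics, longest first so that "---"
# is not consumed as "--" followed by "-".
TEX_SEQUENCES = [
    ("---", "\u2014"),  # EM DASH
    ("--", "\u2013"),   # EN DASH
    ("``", "\u201C"),   # LEFT DOUBLE QUOTATION MARK
    ("''", "\u201D"),   # RIGHT DOUBLE QUOTATION MARK
    ("`", "\u2018"),    # LEFT SINGLE QUOTATION MARK
    ("'", "\u2019"),    # RIGHT SINGLE QUOTATION MARK
]

def tex_style(text):
    """Replace TeX-like ASCII sequences with typographic characters."""
    for ascii_seq, char in TEX_SEQUENCES:
        text = text.replace(ascii_seq, char)
    return text

def naive_smart_single_quotes(text):
    """Guess the direction of each U+0027 from the character before it."""
    # At the start of the text, or after whitespace or an opening
    # bracket, assume an opening quote ...
    text = re.sub(r"(?:^|(?<=[\s(\[{]))'", "\u2018", text)
    # ... and treat every remaining U+0027 as closing quote or apostrophe.
    return text.replace("'", "\u2019")

print(tex_style("``isn't'' and pages 1--2"))
print(naive_smart_single_quotes("'quoted' isn't"))
print(naive_smart_single_quotes("'twas"))
```

The last call shows why the guessing approach is shaky: a word-initial
apostrophe, as in 'twas, is turned into a left quotation mark.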

I think it is high time to revise national keyboard standards here. We
really need standardized ways to easily enter, say, at least

  U+2018 LEFT SINGLE QUOTATION MARK
  U+2019 RIGHT SINGLE QUOTATION MARK
  U+201C LEFT DOUBLE QUOTATION MARK
  U+201D RIGHT DOUBLE QUOTATION MARK
  U+2013 EN DASH
  U+2014 EM DASH

on keyboards for English language users, and corresponding extensions on
other national keyboard standards. This might be a good opportunity to
introduce on US keyboards the Level 2 Select key (AltGr), while on
European keyboards it is probably sufficient to just add appropriate
labels to a number of new Level 2 Select positions.
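
For reference, the six characters listed above can be checked against
the two 8-bit encodings mentioned in this message. The Python sketch
below uses a hypothetical helper, encoded_as, and Python's codec names
cp1252 and latin-1 (ISO 8859-1).

```python
import unicodedata

# The six typographic characters proposed for keyboard entry.
TYPOGRAPHIC = "\u2018\u2019\u201C\u201D\u2013\u2014"

def encoded_as(ch, codec):
    """Return the bytes for ch in codec, or None if it has no encoding."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None

for ch in TYPOGRAPHIC:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch),
          "| cp1252:", encoded_as(ch, "cp1252"),
          "| latin-1:", encoded_as(ch, "latin-1"))
```

All six have a position in CP1252 and none has one in ISO 8859-1.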

Maybe the folks who blessed us a few years ago with the Windows 95 keys
are in a good position to help and start promoting something much more
useful here, towards finally upgrading keyboard layout standards to the
needs of the typographic word processing era?

We could have either commonly agreed entry methods for these characters
(say, as a new amendment to ISO 14755) or, better still, new labeled key
positions.

Opinions and suggestions?

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>


