Re: ASCII fallbacks for Unicode characters

From: Jonathan Coxhead (jonathan@doves.demon.co.uk)
Date: Wed Aug 18 1999 - 21:12:56 EDT


   Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> wrote ...

 | I'd like to put together an ASCII fallback table that does exactly
 | that for at least the most frequently needed characters for which we
 | do use fallbacks in daily life already. Here is a start:
 |
 | [...]

   This has something in common with the "Atomic Theory" I wrote up
last month. I systematically went through all the "Western" characters
in Unicode (Latin, Greek, Russian and symbolic) and decomposed them
into a simpler set of characters modified by "presentation
suggestions", with the idea that unsophisticated renderers could ignore
the presentation suggestions and get legible results. I was more
concerned with the abstract semantics of the characters than with their
visual appearance, but there are many points of similarity between your
table and mine: in particular

 "..." <- 0x2026 HORIZONTAL ELLIPSIS
 "^" <- 0x02C6 MODIFIER LETTER CIRCUMFLEX ACCENT
 "S" <- 0x0160 LATIN CAPITAL LETTER S WITH CARON
 "OE" <- 0x0152 LATIN CAPITAL LIGATURE OE
 "Z" <- 0x017D LATIN CAPITAL LETTER Z WITH CARON
 "~" <- 0x02DC SMALL TILDE
 "TM" <- 0x2122 TRADE MARK SIGN
 "s" <- 0x0161 LATIN SMALL LETTER S WITH CARON
 "oe" <- 0x0153 LATIN SMALL LIGATURE OE
 "z" <- 0x017E LATIN SMALL LETTER Z WITH CARON
 "Y" <- 0x0178 LATIN CAPITAL LETTER Y WITH DIAERESIS
 "/" <- 0x2215 DIVISION SLASH
 "<<" <- 0x226A MUCH LESS-THAN
 ">>" <- 0x226B MUCH GREATER-THAN

and I also proposed entries like

 "1/2" <- 0x00BD VULGAR FRACTION ONE HALF

etc. The details are at <http://www.doves.demon.co.uk/atomic.html>,
though not at present in a very algorithm-friendly form, and not
involving modifier letters yet. I intend to rectify both omissions.
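
A minimal sketch, assuming Python and its standard unicodedata module,
of what an algorithm-friendly form of the entries above might look
like. The names FALLBACKS and ascii_fallback are illustrative only and
appear in neither table. The NFKD step is an assumption too: it is only
a rough stand-in for "ignore the presentation suggestions", not the
decomposition described at the URL above.

```python
import unicodedata

# Explicit fallbacks, copied from the entries listed in this message.
FALLBACKS = {
    0x2026: "...",  # HORIZONTAL ELLIPSIS
    0x02C6: "^",    # MODIFIER LETTER CIRCUMFLEX ACCENT
    0x0160: "S",    # LATIN CAPITAL LETTER S WITH CARON
    0x0152: "OE",   # LATIN CAPITAL LIGATURE OE
    0x017D: "Z",    # LATIN CAPITAL LETTER Z WITH CARON
    0x02DC: "~",    # SMALL TILDE
    0x2122: "TM",   # TRADE MARK SIGN
    0x0161: "s",    # LATIN SMALL LETTER S WITH CARON
    0x0153: "oe",   # LATIN SMALL LIGATURE OE
    0x017E: "z",    # LATIN SMALL LETTER Z WITH CARON
    0x0178: "Y",    # LATIN CAPITAL LETTER Y WITH DIAERESIS
    0x2215: "/",    # DIVISION SLASH
    0x226A: "<<",   # MUCH LESS-THAN
    0x226B: ">>",   # MUCH GREATER-THAN
    0x00BD: "1/2",  # VULGAR FRACTION ONE HALF
}


def ascii_fallback(text, unknown="?"):
    """Return an ASCII-only approximation of text (hypothetical helper)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Already ASCII: pass through unchanged.
            out.append(ch)
        elif code in FALLBACKS:
            # The explicit table wins over any generic rule.
            out.append(FALLBACKS[code])
        else:
            # Assumption: decompose and drop combining marks, as a crude
            # analogue of a renderer ignoring presentation suggestions.
            decomposed = unicodedata.normalize("NFKD", ch)
            base = "".join(c for c in decomposed
                           if not unicodedata.combining(c))
            if base and all(ord(c) < 128 for c in base):
                out.append(base)
            else:
                out.append(unknown)
    return "".join(out)
```

The table is consulted before the generic decomposition because the
compatibility decomposition of 0x00BD uses 0x2044 FRACTION SLASH, which
is itself outside ASCII; an explicit "1/2" entry avoids that.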

        /|
 o o o (_|/
        /|
       (_/


