RE: How to encode abbreviations [Was: Representative glyphs for combining kannada signs]

From: Keutgen, Walter (
Date: Tue Mar 28 2006 - 04:12:10 CST

    Antoine, Doug,

    my understanding of the intent of the Unicode Standard is that such superscripting belongs to 'rich text' properties.

    Some superscripts are encoded as characters. For historical reasons, e.g. compatibility with existing character sets?

    If the complaint is that U+1D50 and U+1D49 do not display, where are the 'l' for 'M<super>lle' and the 's' for the plurals?
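
    For reference, here is a minimal Python sketch (standard unicodedata module only; the helper name describe is hypothetical) showing what the two characters under discussion are called and what plain-text fallback their compatibility decomposition gives:

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, NFKC fallback) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.normalize("NFKC", ch),
    )

# The two characters discussed in this thread.
for ch in "\u1d50\u1d49":
    print(describe(ch))

# Compatibility normalization folds the superscripted form back to plain letters.
abbrev = "M\u1d50\u1d49"
plain = unicodedata.normalize("NFKC", abbrev)
print(plain)
```

    The NFKC fallback discards the superscripting, which is consistent with treating it as a rich-text property rather than as part of the plain text.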

    One could add the whole alphabet(s) as superscripts to be sure and, why not, as subscripts as well for other languages.

    In real everyday usage, in French 'mechanical typewriting', 'PC typewriting' and *handwriting*, one did/does not superscript the endings of the abbreviations 'Mme, Mmes, Mlle, Mlles, Dr, Drs, Ir, Irs'.

    In handwriting one always uses superscripts for ordinal numbers, which is not possible in flat-text PC writing and required some fumbling with the mechanical typewriter. That is, '1er', '2ème' or '2e' etc. require superscripts, as do the forms derived from the Latin wording, '1o, 2o' etc., for which one uses the '°'. One also often sees Me (maître = master in law) with a superscripted e. As to whether '°' is a superscripted 'o' or the degree sign, my keyboard does not tell me. I would bet, however, that the '°' is often smaller than the superscripted e.
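
    Which of the two look-alike characters a given text actually contains can be checked programmatically, even when the keyboard or the font does not make it obvious. A minimal Python sketch, using only the standard unicodedata module (the names DEGREE, ORDINAL, identify and is_superscript_o are hypothetical, chosen for illustration):

```python
import unicodedata

DEGREE = "\u00b0"   # the character usually produced by the '°' key
ORDINAL = "\u00ba"  # the look-alike used for '1º, 2º'

def identify(ch):
    """Return the Unicode character name of a single character."""
    return unicodedata.name(ch)

def is_superscript_o(ch):
    """True if the character's compatibility decomposition is a superscript 'o'."""
    return unicodedata.decomposition(ch) == "<super> 006F"

for ch in (DEGREE, ORDINAL):
    print(f"U+{ord(ch):04X}", identify(ch), is_superscript_o(ch))
```

    Only the second character is defined in the character database as a superscripted 'o'; the first has no decomposition at all.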


    -----Original Message-----
    From: [] On Behalf Of Antoine Leca
    Sent: Tuesday, 28 March 2006 10:20
    To: Unicode Mailing List
    Subject: How to encode abbreviations [Was: Representative glyphs for combining kannada signs]

    Doug Ewell wrote:
    > Antoine Leca <Antoine10646 at leca dash marti dot org> wrote:
    >> Put it in clear: to write the French equivalent of Mrs, I can:
    >> - either write the slightly incorrect Mme
    >> - or write the more "correct" M[][] (where [] represent the empty box
    >> that everybody except four cats will effectively see).
    >> Somewhere I am thinking this is *not* a working solution.
    > So we avoid using rare and -- more importantly -- newly added
    > characters, preferring ASCII fallbacks of the sort Unicode was
    > intended to replace.

    While I agree with your pertinent remark in general, in THIS case I
    believe it does not apply. Those two characters (U+1D50 and U+1D49, ᵐᵉ)
    do not seem to me to be intended for French abbreviations (or any written
    language's typographic effects), but rather for phonetics. As a result, it
    seems difficult to me to ask French people to have phonetics-specialized
    fonts in order to read something as common as the abbreviation for Mrs,
    just because someone noticed that those characters almost fit that
    particular need.
    I could be wrong, though.

    In fact, while I was too ironic with my [] description, behind the
    scenes there is a real problem with the use of characters which have
    been added for some specialized purpose but are then reused.
    Of course the reuse of characters for purposes which were not intended
    from the beginning, while it may sometimes be seen as incorrect or wrong
    by standardization purists, is a very well-known evolution for *every*
    character repertoire standardized to date, whether in the digital era or
    before (I mean, since men invented writing).

    However, sometimes the out-of-intent uses are less adequate; in general,
    those unfortunate uses fade away quickly, probably because they do not
    catch on; interoperability problems just limit their use, of course.
    In that way, your remark above is somewhat limited: all those new (sparkled)
    characters primarily come into use when they are readily available for the
    purpose for which they were first introduced; it is only at a second stage,
    when the necessary infrastructure is in place, that they can be used for
    different, perhaps incorrect uses. One of my favorites is the French use of °
    to mark the abbreviation of a final o (as in 1º, 2º), and the Spanish use of º
    to mark degrees; both characters are in T.61 and its derivatives, including
    8859-1; of course, it is the presence of the characters in the keyboard
    layouts which is the root cause; counterexamples, or examples of the reverse,
    are ASCII - and ', each of which covers several meanings.


    This archive was generated by hypermail 2.1.5 : Tue Mar 28 2006 - 04:22:48 CST