RE: How to encode abbreviations [Was: Representative glyphs for combining kannada signs]

From: Keutgen, Walter
Date: Thu Mar 30 2006 - 10:30:32 CST



    I tend to attribute quotations to the wrong author, probably because flat text gives the eye so little guidance.
    So first, I apologize for this.

    The two of us almost agree and, as the message thread necessarily branches, I would like to summarize my opinion,
    also so that other participants do not attribute to me opinions that I do not hold.

    Regarding "superscript o" and the use of degree sign instead I will respond separately.

    My opinion is:

    About "Mme", "Melle", "Mgr", "Dr", "Ir"

    1. Superscripts in the abbreviations are not *necessary* in French for "decent" printing.
       Somebody, not you, used "decent" in the thread.
    2. You provide the best argumentation for this below.
       The abbreviations are so strictly codified that everybody identifies them as abbreviations and
       knows at least the current ones.
       Somebody not knowing one of them will not be helped by superscripting.
    3. For *elegant* printing e.g. in wedding invitations and the like, one might opt for superscripting.
    4. The coding for the superscripting should be at a higher level than the character level,
       in what I called "rich text", i.e. outside the scope of the Unicode standard.
    5. In no case should one use characters from the "IPA Extensions" block or
       any others that one could find in the Unicode character set for this.
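    Points 4 and 5 can be illustrated with a minimal sketch, assuming HTML as the "rich text" layer. The table and the function name are hypothetical and only cover the abbreviations listed above; the underlying characters stay ordinary letters.

```python
import html

# Hypothetical lookup table: abbreviation -> (base part, part that may be raised).
# The plain-text spelling is always base + raised, using ordinary letters only.
ABBREVIATIONS = {
    "Mme": ("M", "me"),
    "Melle": ("M", "elle"),
    "Mgr": ("M", "gr"),
    "Dr": ("D", "r"),
    "Ir": ("I", "r"),
}


def to_rich_text(word: str) -> str:
    """Return HTML for `word`, raising the ending of a known abbreviation."""
    parts = ABBREVIATIONS.get(word)
    if parts is None:
        # Unknown words are passed through unchanged (only escaped for HTML).
        return html.escape(word)
    base, raised = parts
    return f"{html.escape(base)}<sup>{html.escape(raised)}</sup>"


print(to_rich_text("Mme"))     # M<sup>me</sup>
print(to_rich_text("Madame"))  # Madame
```

    The styling lives entirely in the markup, so removing the tags gives back the codified spelling "Mme".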

    About ordinal numbers (1er, 2ème etc.)

    a. The endings should be superscripted for "decent" printing,
       so that the reader recognizes the number more easily.
    b. The Unicode standard does not support these superscripts either.
    c. One cannot blame Unicode for this, because neither ISO-8859
       nor the keyboard designers did anything for this.
    d. For flat text the computer has even lost the mechanical typewriter's
       ability to turn the drum by one tooth.
    e. The computer has solved the problem by "word processors"
       with autocorrect functions.
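    As a rough sketch of what such an autocorrect function does for point e., assuming HTML output and a list of endings limited to those mentioned in this thread (the names below are hypothetical, and real word processors certainly use richer rules):

```python
import re

# Assumed set of ordinal endings, taken from the examples in this thread
# (er, ère, re, nd, nde, ème and their plurals). Longer endings are listed
# first so that, for instance, "ères" is tried before "ère".
ENDINGS = ["èmes", "ères", "ndes", "ème", "ère", "nde", "nds", "ers", "res",
           "nd", "er", "re"]
ORDINAL = re.compile(r"\b(\d+)(" + "|".join(ENDINGS) + r")\b")


def autocorrect_ordinals(text: str) -> str:
    """Wrap the ending of a French ordinal number in <sup>...</sup>."""
    return ORDINAL.sub(r"\1<sup>\2</sup>", text)


print(autocorrect_ordinals("Le 1er mai"))    # Le 1<sup>er</sup> mai
print(autocorrect_ordinals("la 2ème fois"))  # la 2<sup>ème</sup> fois
```

    The letters of the ending are left untouched; only markup is added around them.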

    Points d. and e. also apply to the abbreviation discussion. However, I would not fiddle with the drum of a mechanical typewriter for "Melle", because the "ll" would end up significantly above the "M", nor, as a matter of standardization, for "Mme".


    * Do I correctly understand the Unicode standard regarding this?
    * Are you advocating that some kind of formatting marks be added to Unicode for this purpose?

    Best regards



    -----Original Message-----
    From: Philippe Verdy
    Sent: Wednesday, 29 March 2006 23:09
    To: Keutgen, Walter
    Subject: Re: How to encode abbreviations [Was: Representative glyphs for combining kannada signs]

    From: "Keutgen, Walter"
    > Philippe,
    > your remark is similar to Antoine Leca's of 2006.03.28 10:12 GMT
    >>One of my preferred is the French use of ° to mark
    >>the abbreviation of a final o (as in 1º, 2º), and the Spanish use of º to

    It is not: the degree sign after the number above does not stand for the letter o. It is merely a way of using some superscript in place of the actual ones. The actual superscripts would be:

    "1<sup>er</sup>" (masculine singular), "1<sup>ère</sup>" (or "1<sup>re</sup>", feminine singular), and their plurals, instead of "1°", which any French reader would interpret as "one degree" and not as the adjective "premier" or "première" (= first) or the adverb "premièrement" (= firstly).

    "2<sup>nd</sup>" or "2<sup>nde</sup>", or "2<sup>ème</sup>", and their plurals, instead of "2°", which any French reader would interpret as "two degrees" and not as the adjective "deuxième", the masculine adjective "second", the feminine adjective "seconde" (= second), or the adverb "deuxièmement" (= secondly).

    The important thing to note is that these abbreviations, which use final letters in superscript, carry the regular desinences in the superscript. They are made of the SAME letters as the unabbreviated word, and so have the same desinences for the feminine and the plural. That is a good reason not to disunify the superscripts from their normal base letters: how would you otherwise produce the feminine (superscript e) or the plural (superscript s)?
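    One practical consequence of disunification can be sketched with Python's standard unicodedata module. The example assumes someone spells "1er" with the modifier letters U+1D49 and U+02B3 instead of the ordinary letters e and r; the variable names are illustrative only.

```python
import unicodedata

plain = "1er"
# Same ordinal spelled with MODIFIER LETTER SMALL E (U+1D49)
# and MODIFIER LETTER SMALL R (U+02B3) instead of ordinary letters.
disunified = "1\u1d49\u02b3"

# The two strings are different code point sequences, so a plain
# comparison or substring search no longer finds the ending.
print(plain == disunified)   # False
print("er" in disunified)    # False

# Compatibility normalization (NFKC) folds the modifier letters back
# to the ordinary letters, which shows they are the same letters.
print(unicodedata.normalize("NFKC", disunified) == plain)  # True
```

    Searching and comparison only work again after a normalization step that undoes the disunification.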

    In fact it is not just a matter of style. The superscripted abbreviations convey something that would otherwise be missing from the semantics: the fact that this is an abbreviation. So "Mme" by itself does not make clear that it abbreviates "Madame". It is nevertheless accepted in French without any additional sign or style, because French has very strict orthographic rules for abbreviations (for example, "Monseigneur" is abbreviated "Mgr", "Messeigneurs" is abbreviated "M.S.S.", "Monsieur" is abbreviated "M.", not "Mr", and "Messieurs" is abbreviated "MM."), including strict rules for capitals and for the position of abbreviation dots.

    The superscript is another way to make the abbreviation even more explicit, but without adding an incorrect dot. The superscript does not violate this rule because it is a matter of style. But it shows that this really is an abbreviation, so the superscript denotes an invisible abbreviation dot: "M<sup>me</sup>" merely represents "", but without marking the dot that is forbidden there. Orthographically, the exact letters and their case are important, and any attempt to disunify these letters will cause interpretation problems.

    So if something must be encoded in the plain text, it can only be a formatting control, something ignorable in collation but added around the regular, normative letters and dots. The dots and letters in French abbreviations are immutable (very important for legal abbreviations used as trademarks, or for designating people precisely in contracts and letters).
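    A minimal sketch of how such ignorable formatting controls might behave. The two control characters are hypothetical: they are borrowed from the Private Use Area purely for illustration, since no such characters exist in Unicode, and the collation key is deliberately naive.

```python
# Hypothetical formatting controls, taken from the Private Use Area
# for illustration only; Unicode defines no such characters.
SUP_START = "\ue000"
SUP_END = "\ue001"

# Translation table that deletes both controls.
IGNORABLE = {ord(SUP_START): None, ord(SUP_END): None}


def strip_controls(text: str) -> str:
    """Return the normative letters and dots, without the controls."""
    return text.translate(IGNORABLE)


def collation_key(text: str) -> str:
    """Naive sort key: ignore the controls, then fold case.

    A real implementation would use locale-aware collation.
    """
    return strip_controls(text).casefold()


marked = f"M{SUP_START}me{SUP_END}"  # "Mme" with its ending marked as raised

print(strip_controls(marked) == "Mme")                 # True
print(collation_key(marked) == collation_key("Mme"))   # True
print(sorted(["Mme", marked, "Mgr", "M."], key=collation_key))
```

    The letters and dots are never altered; the controls only surround them and disappear for comparison and sorting.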

    This rule carries over to abbreviations used in other domains (so "ditto" is normally abbreviated "d<sup>o</sup>", with a normal small letter o and not with the degree sign, in accurate typography, because of the orthographic rules). However, this abbreviated word is less important, and is typically found in the cells of numeric tables, or in very condensed texts such as multiple lemmas for the same entry in a very rich dictionary or encyclopaedia, or in a tourist guide. The word is avoided in legal contracts because its scope is too ambiguous, so it would not be abbreviated there. This explains why strict typography is less often observed for it.

    Anyway, the degree sign is often too small and too far from the baseline to stand for the superscript small Latin letter o used in an abbreviation, because the degree sign normally aligns with the top of M-height digits.
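    The difference is also visible at the character level. The following sketch uses Python's standard unicodedata module to compare the degree sign (U+00B0), the masculine ordinal indicator (U+00BA) and the ordinary letter o; it only inspects character properties and says nothing about glyph design.

```python
import unicodedata

# Degree sign, masculine ordinal indicator, ordinary small letter o.
for ch in ("\u00b0", "\u00ba", "o"):
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        repr(unicodedata.decomposition(ch)),
    )

# Only the ordinal indicator has a compatibility mapping to the letter o;
# the degree sign has no relation to that letter at all.
print(unicodedata.normalize("NFKC", "d\u00ba") == "do")  # True
print(unicodedata.normalize("NFKC", "d\u00b0") == "do")  # False
```

    A "d°" typed with the degree sign therefore never folds back to the letters "do", whatever it looks like on paper.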

    This archive was generated by hypermail 2.1.5 : Thu Mar 30 2006 - 10:38:09 CST