"raised rings" and compatibility abbreviations - Was: [Flags] Waterford Steamship Co

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Jul 19 2010 - 11:37:49 CDT

    On 18/07/10 22:14, "António MARTINS-Tuválkin" <antonio@tuvalkin.web.pt> wrote:
    > On 2010.07.18, 13:03, Elias quoted and wrote:
    >
    > >> That one's even in my keyboard: "C.º". (Underline [optional],
    > >> depending on the typeface.)
    > >
    > > Isn't that the sign for "degree"? It could be used for a raised o
    > > anyway, I suppose. :-)
    >
    > Those are different symbols — the degree sign is never underlined and
    > it is a ring, hardly varies with typeface style, while the raised "o" is
    > an "o". There are more — here’s from the Unicode repertoire:
    >
    > ° U+00B0 DEGREE SIGN
    > º U+00BA MASCULINE ORDINAL INDICATOR
    > ˚ U+02DA RING ABOVE
    > ◌̊ U+030A COMBINING RING ABOVE
    > ◌ͦ U+0366 COMBINING LATIN SMALL LETTER O
    > ◌֯ U+05AF HEBREW MARK MASORA CIRCLE
    > ᅌ U+114C HANGUL CHOSEONG YESIEUNG
    > ᐤ U+1424 CANADIAN SYLLABICS FINAL RING
    > ᴼ U+1D3C MODIFIER LETTER CAPITAL O
    > ᵒ U+1D52 MODIFIER LETTER SMALL O
    > ◌〫 U+302B IDEOGRAPHIC RISING TONE MARK
    > 〬 U+302C IDEOGRAPHIC DEPARTING TONE MARK
    > ◌゚ U+309A COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
    > ゜ U+309C KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
    >
    > (I may have missed some.) Of these, only two (U+00BA and U+1D52) are
    > equivalent.

    My opinion is that U+00BA is still different from U+1D52, even if both
    represent a raised letter o.
    - U+00BA (MASCULINE ORDINAL INDICATOR) **may** be underlined and it is
    typically used to write the abbreviation of numero (even if there's
    also a separate compatibility character for this abbreviation). In
    addition, it does not necessarily align with normal text rendered as
    superscript.
    - U+1D52 (MODIFIER LETTER SMALL O), however, **must never** be
    underlined. Its vertical alignment is fixed by the design of the
    glyphs for the Latin script (it stays coherent with the font's
    internal metrics) and **must** be independent of the vertical
    alignment of styled superscripts. It should not be compatibility
    decomposed to a small letter o with a stylable superscript (which
    might not respect the intended font metrics), and it does not imply
    any abbreviation.
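
    As a point of reference for the two items above: the Unicode
    Character Database, as exposed by Python's standard unicodedata
    module, gives both characters a <super> compatibility decomposition
    to U+006F, so NFKC normalization does erase the distinction argued
    for here. The sketch below only prints what the database reports;
    the helper name describe() is an illustrative assumption, not part
    of any library.

```python
import unicodedata

# Code points discussed above; U+2116 (NUMERO SIGN) is the separate
# compatibility character for the "numero" abbreviation.
CODE_POINTS = ["\u00ba", "\u1d52", "\u2116"]


def describe(ch: str) -> tuple[str, str, str]:
    """Return (name, compatibility decomposition, NFKC form) of one character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKC", ch),
    )


for ch in CODE_POINTS:
    name, decomp, nfkc = describe(ch)
    print(f"U+{ord(ch):04X} {name}: {decomp!r} -> {nfkc!r}")
```

    A process that must keep U+00BA and U+1D52 apart therefore has to
    avoid the compatibility normalization forms (NFKC, NFKD).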

    So U+00BA is kept for compatibility with the various fonts built for
    ISO 8859-1, to denote abbreviations in plain text (where rich-text
    styling is not available), but it has two possible distinct glyph
    mappings in actual fonts (with or without underlining). U+1D52 is
    not ambiguous at all, and should be used in orthographies. For
    noting abbreviations, a standard small letter o (U+006F) should
    preferably be used in texts instead, with styled superscripts.

    For HTML, superscripts used to denote final letters of abbreviations
    should use the styled superscripts and the normal Latin letters like
    this:
      <abbr title="numéro">n<sup>o</sup></abbr>
      <abbr title="Numéro">N<sup>o</sup></abbr>
    and this can be applied consistently for all other abbreviation
    superscripts like:
      <abbr title="mademoiselle">M<sup>lle</sup></abbr>
      <abbr title="premier">1<sup>er</sup></abbr>
      <abbr title="primo" lang="la">1<sup>o</sup></abbr>
      <abbr title="primero" lang="es">1<sup>o</sup></abbr>
      <abbr title="second">2<sup>nd</sup></abbr>
      <abbr title="deuxième">2<sup>e</sup></abbr>
      <abbr title="secundo" lang="la">2<sup>o</sup></abbr>
    as well as in many other possible abbreviations. Those who prefer to
    underline the abbreviation superscripts everywhere in Spanish texts
    can still do that in their CSS stylesheets:
      abbr sup { text-decoration: inherit; } /* this may be changed to underline */
      abbr sup:lang(la) { text-decoration: inherit; }
      abbr sup:lang(es) { text-decoration: underline; }

    Note that the same stylesheet may still use something other than
    underlining to distinguish the abbreviation "1o" in foreign Latin
    (for example, italics) from the one in native Spanish (not italic),
    if underlining is preferred everywhere for all superscript
    abbreviations.
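
    For documents produced by a program, the markup pattern above could
    be generated rather than typed by hand. This is only a minimal
    sketch: abbr_html() is a hypothetical helper, and the only library
    call it relies on is html.escape from the Python standard library.

```python
from html import escape
from typing import Optional


def abbr_html(base: str, sup: str, title: str, lang: Optional[str] = None) -> str:
    """Build <abbr title="...">base<sup>sup</sup></abbr> (hypothetical helper)."""
    # The lang attribute is emitted only when a language tag is given.
    lang_attr = f' lang="{escape(lang, quote=True)}"' if lang else ""
    return (
        f'<abbr title="{escape(title, quote=True)}"{lang_attr}>'
        f"{escape(base)}<sup>{escape(sup)}</sup></abbr>"
    )


print(abbr_html("n", "o", "numéro"))
print(abbr_html("1", "o", "primero", lang="es"))
```

    Keeping the language tag on the element is what lets the :lang()
    selectors above choose between underlined and plain superscripts.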

    So the compatibility U+00BA for the "masculine ordinal indicator" is
    probably not the best option in HTML or any other rich-text format
    which supports the superscript style, and that may even preserve the
    language information (and the same is true for the "feminine ordinal
    indicator" which behaves identically).

    It's just intended as a compatibility character used to remap some
    rich-text documents to plain text, by detecting and remapping only
    **a few** of the superscripts used in rich-text documents, knowing
    that such a conversion will lose the author's underlined or
    non-underlined preference, as well as part of the semantics:

    - For example, such a conversion to plain text will lose the
    superscript information denoting abbreviations like "Mlle" for
    "mademoiselle" (not really ambiguous in French). It will also lose
    the abbreviation semantics in cases like "Me" for "Maître", which
    may become confusable with the capitalized French pronoun "me", as
    well as the intended distinction between "primo" and "primero" in
    Spanish.

    - But using U+00BA in this conversion to plain text may still be
    useful to preserve the semantic distinction between the adjective
    "no" and the abbreviation of "numero" in English (and you may also
    use the character mapped specifically for compatibility for the
    2-letter capitalized abbreviation "No").
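
    The conversion described in the two points above might be sketched
    as follows. Everything here is an assumption made for illustration:
    the function name to_plain(), the use_numero_sign option, and the
    simplification that the input contains only <abbr> elements shaped
    exactly like the HTML examples earlier (a real converter would use
    an HTML parser, not a regular expression).

```python
import re

# Only these superscripts have a plain-text compatibility character:
# U+00BA (masculine) and U+00AA (feminine) ordinal indicators.
ORDINAL = {"o": "\u00ba", "a": "\u00aa"}

# Matches <abbr ...>base<sup>letters</sup></abbr> as in the examples above.
ABBR_RE = re.compile(
    r"<abbr\b[^>]*>(?P<base>[^<]*)<sup>(?P<sup>[^<]*)</sup></abbr>"
)


def to_plain(html: str, use_numero_sign: bool = False) -> str:
    """Flatten abbreviation markup to plain text (lossy by design)."""

    def repl(m: re.Match) -> str:
        base, sup = m.group("base"), m.group("sup")
        if use_numero_sign and base == "N" and sup == "o":
            return "\u2116"  # NUMERO SIGN, the 2-letter compatibility character
        # Any other superscript loses its styling, title and language.
        return base + ORDINAL.get(sup, sup)

    return ABBR_RE.sub(repl, html)


print(to_plain('<abbr title="numéro">n<sup>o</sup></abbr>'))
print(to_plain('<abbr title="mademoiselle">M<sup>lle</sup></abbr>'))
```

    With this sketch, "primo" (lang="la") and "primero" (lang="es")
    both flatten to the same "1" followed by U+00BA, which is exactly
    the kind of loss noted above.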

    Philippe.


