Re: U+00BA and U+00AA (was: "Re: Public Review Issue Unicode Technical Report #25, "Unicode Support for Mathematics"")

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Jan 25 2007 - 19:25:49 CST

    From: "António Martins-Tuválkin" <tuvalkin@gmail.com>
    > or "MªJº" for often female given name "Maria João"). This makes the
    > Unicode names (with "masculine" and "feminine" on them) incorrect.

    Fully agree.
    More appropriate names would be "LATIN ABBREVIATION SMALL LETTER A" and "...O" for what appears as composite letter-like symbols.

    I am not saying that they "are" small letter a or small letter o, but that they "include" these letters as part of their meaning and presentation form. The other information carried by these letter-like symbols is the abbreviation notation itself, which should be a superscript (as the compatibility equivalence indicates) but can also include an additional hyphen under it.
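
    The compatibility equivalence mentioned above can be inspected with Python's standard unicodedata module. This is only a minimal sketch; the helper name compat_info is made up for illustration.

```python
import unicodedata

def compat_info(ch):
    """Return (character name, compatibility decomposition, NFKC form)."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKC", ch),
    )

# U+00AA and U+00BA are the two ordinal indicators under discussion.
for ch in ("\u00aa", "\u00ba"):
    name, decomposition, folded = compat_info(ch)
    print(f"U+{ord(ch):04X} {name}: {decomposition!r} -> {folded!r}")
```

    The decomposition carries only a <super> tag plus the base letter, so compatibility normalization keeps the letter but drops the superscript, and says nothing about an optional stroke under it.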

    But then there is the problem of how these symbols coexist with the more general notation for abbreviations, which uses normal letters with regular superscripting. In some usages these abbreviations must be consistently underscored, or not underscored at all!

    The most obvious example is found in French with "n°<sup>s</sup>" or "N°<sup>s</sup>" (for "numéros"), which MUST be distinct from the common French word "nos" or "Nos" (the possessive determiner, meaning "our" with a plural noun, and possibly capitalized).

    The same goes for the abbreviated plural of "f°(s)" = "folio(s)", or "r°(s)" = "recto(s)", or "v°(s)" = "verso(s)": how can we get a consistent presentation of the plural mark?

    The same goes for abbreviating a final "-tion" syllable: often the superscript o is used alone, but stricter typography requires following it with "n", so we should associate the superscript o (absent from French keyboards, where the degree symbol is most often used instead) with the superscript n (which is found on some French keyboards!). How can we get a consistent presentation?

    Suppose we add a "LATIN ABBREVIATION SMALL LETTER S" and "... N". Then comes the question of many other abbreviations using the final letters of the unabbreviated word: all letters become possible, including letters with diacritics (for example, SMALL E WITH ACUTE in French).

    The main idea here is that nothing in Unicode (or in the rich-text information of HTML <sup>) marks those letters as actually denoting the letters of an abbreviation, and Unicode leaves the "ordinal masculine/feminine indicators" completely unassociated with the letters that they still represent!

    What would be needed is a pair of characters to mark such information explicitly, such as BABM (begin abbreviation mark) and EABM (end abbreviation mark).

    In that case, conceptually, we have these semantic equations:
    * "masculine indicator" = <BABM>, <LATIN SMALL LETTER O>, <EABM>
    * "feminine indicator" = <BABM>, <LATIN SMALL LETTER A>, <EABM>
    But any number of other abbreviations can be formed by including any sequence of letters between these special formatting controls!
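
    The equations above can be sketched in Python. BABM and EABM are not encoded in Unicode, so this sketch borrows two private-use code points as stand-ins; those code points and the function names are assumptions made purely for illustration.

```python
import re
from typing import List

# Hypothetical stand-ins: BABM and EABM do not exist in Unicode, so two
# private-use code points are borrowed here for illustration only.
BABM = "\ue000"  # begin abbreviation mark (assumed)
EABM = "\ue001"  # end abbreviation mark (assumed)

# The two "semantic equations" for the existing ordinal indicators.
ORDINAL_EQUIVALENTS = {
    "\u00ba": BABM + "o" + EABM,  # masculine indicator
    "\u00aa": BABM + "a" + EABM,  # feminine indicator
}

def expand_indicators(text: str) -> str:
    """Replace each ordinal indicator with its bracketed-letter sequence."""
    return "".join(ORDINAL_EQUIVALENTS.get(ch, ch) for ch in text)

def abbreviation_runs(text: str) -> List[str]:
    """Return the letter sequences enclosed by BABM ... EABM."""
    return re.findall(f"{BABM}([^{BABM}{EABM}]*){EABM}", text)

marked = expand_indicators("M\u00aaJ\u00ba")
print(abbreviation_runs(marked))
print(abbreviation_runs("f" + BABM + "os" + EABM))
```

    With such a scheme the ordinal indicators become two instances of a general mechanism, and a multi-letter run such as "os" is handled the same way as a single letter.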

    In a rich-text format (XML, XHTML...), we would tag abbreviation marks like this:
    * "f<abm>os</abm>" for abbreviating "folios"
    * "add<abm>ns</abm>" for abbreviating "additions"

    This leaves the question of the exact presentation form to the renderer (consider Braille rendering, underlining or not, consistent letter sizes...).
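
    To illustrate how the presentation could be left to the renderer, here is a minimal Python sketch that maps the proposed <abm> element onto different output styles. The <abm> tag is only the hypothetical markup suggested above, and the style names and templates are assumptions, not an existing convention.

```python
import re

# Matches the hypothetical <abm>...</abm> element (no nesting assumed).
ABM = re.compile(r"<abm>(.*?)</abm>")

# Illustrative renderer choices; each template receives the marked letters.
TEMPLATES = {
    "superscript": "<sup>{}</sup>",
    "underlined": "<sup><u>{}</u></sup>",
    "plain": "{}",
}

def render_abm(source: str, style: str = "superscript") -> str:
    """Rewrite every <abm> run using the template for the chosen style."""
    template = TEMPLATES[style]
    return ABM.sub(lambda m: template.format(m.group(1)), source)

print(render_abm("f<abm>os</abm>"))
print(render_abm("add<abm>ns</abm>", "underlined"))
```

    Because the source only records that "os" or "ns" is an abbreviation mark, the same text can be underscored or not, consistently, by switching the style in one place.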


