Re: New translation posted

From: Jukka K. Korpela
Date: Sun Feb 04 2007 - 07:21:22 CST


    On Sun, 4 Feb 2007, Philippe Verdy wrote:

    >> No, the Unicode standard clearly says that U+2019 is preferred as
    >> punctuation apostrophe. The character U+0027 should have a neutral
    >> (vertical) glyph, and usually has, though in some fonts it's slightly
    >> slanted.
    > But why then are all French spelling autocorrectors changing the weak
    > ASCII vertical quote into a curly apostrophe?

    My copy of MS Word 2002, with document language set to French (France),
    autocorrects U+0027 to U+2019 when it appears inside a word (as in
    "l'homme"), which is just fine and quite consistent with what I wrote. As
    Asmus wrote in his reply, the ASCII repertoire is largely an input
    repertoire, or something that people type in using common keyboards, and
    it is quite OK to autocorrect the input in a context- and
    language-sensitive way, as long as the user understands what is going on
    and knows how to undo or switch off the autocorrections as needed.

    For French, my copy of MS Word 2002 autocorrects U+0027 to U+2018 (left
    single quotation mark) at the start of a word. This means that by typing
    'foo' I get foo surrounded by single quotation marks. Whether such
    punctuation is acceptable in French is debatable. I have not managed to
    find an authoritative statement on nested quotations (inner punctuation
    marks) in French; by "authoritative" I mean something like a statement
    issued by the French Academy. (The current CLDR data says that inner
    quotation marks are double quotation marks, like the normal quotation
    marks in English, but I have also seen statements about using single
    quotation marks or single guillemets and even double guillemets.)

    Anyway, if a user types 'foo' when typing French, then it is rational to
    expect that he wants to get single quotation marks, whether that's
    orthographically correct by all books or not.
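
    The two autocorrections described above can be sketched roughly as
    follows. This is only an illustration, not MS Word's actual logic: the
    function name autocorrect_quotes and the three regular-expression rules
    are assumptions, and the third rule (a trailing U+0027 becomes U+2019)
    is inferred from the remark that typing 'foo' yields foo surrounded by
    single quotation marks.

```python
import re

APOSTROPHE = "\u2019"  # RIGHT SINGLE QUOTATION MARK, also the punctuation apostrophe
LEFT_QUOTE = "\u2018"  # LEFT SINGLE QUOTATION MARK


def autocorrect_quotes(text: str) -> str:
    """Hypothetical context-sensitive replacement of U+0027 (not Word's real rules)."""
    # Rule 1: U+0027 inside a word (word character on both sides) -> U+2019.
    # Note that \w also matches digits and the underscore.
    text = re.sub(r"(?<=\w)'(?=\w)", APOSTROPHE, text)
    # Rule 2: U+0027 at the start of a word (start of text or after
    # whitespace, followed by a word character) -> U+2018.
    text = re.sub(r"(?<!\S)'(?=\w)", LEFT_QUOTE, text)
    # Rule 3 (assumption): a remaining U+0027 directly after a word -> U+2019.
    text = re.sub(r"(?<=\w)'", APOSTROPHE, text)
    return text


if __name__ == "__main__":
    for sample in ("l'homme", "'foo'"):
        print(sample, "->", autocorrect_quotes(sample))
```

    A real implementation would also need an undo mechanism and a
    per-language switch, as noted earlier; the sketch covers only the
    character substitution itself.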

    > I did not think about the U+02BC modifier letter, but it is also an
    > alternative encoding with similar rendering; it looks quite bad,
    > however, because of its decomposition properties, and its glyph is the
    > same as the acute accent, which is straight and too horizontal.

    There's some confusion here, but it's actually water under the bridge.
    U+02BC is just something else - neither a normal punctuation apostrophe
    nor a quotation mark.

    > Note also that French has some usage of the other curly single quote
    > also as a letter (for transcribing some languages, notably the Arabic
    > aleph) ;

    That's because (probably unofficial) transliteration schemes use it,
    either because their designers did not know better alternatives or
    because they were afraid of using them due to lack of sufficient
    software (font) support.

    The international standard for romanization of Arabic, ISO 233, uses left
    and right half ring to correspond to certain Arabic consonant letters.
    Various simple transliteration schemes often use either U+2019 or U+0027
    for one of them and leave the other out. By doing so, you choose to use
    characters with multiple semantics instead of specific characters. This
    might be a practical choice for various reasons, but a more systematic
    and less ambiguous alternative exists, too.
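
    The difference between "multiple semantics" and "specific characters"
    shows up in the Unicode character properties. The following sketch
    prints the name and general category of the characters discussed in
    this thread. It assumes that the "left and right half ring" correspond
    to the code points U+02BF and U+02BE; the helper name describe is
    likewise only illustrative.

```python
import unicodedata

# Characters mentioned in the discussion, plus the two half rings
# (U+02BE and U+02BF are an assumption about which code points are meant).
CANDIDATES = ["\u0027", "\u2018", "\u2019", "\u02BC", "\u02BE", "\u02BF"]


def describe(ch: str) -> tuple:
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


if __name__ == "__main__":
    for ch in CANDIDATES:
        code, name, category = describe(ch)
        print(f"{code}  {category}  {name}")
```

    The punctuation characters report P* categories (Po, Pi, Pf), whereas
    the modifier letters report Lm, that is, they are classified as letters
    and so behave as part of a word in searching and word-breaking.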

    Jukka "Yucca" Korpela,

    This archive was generated by hypermail 2.1.5 : Sun Feb 04 2007 - 07:25:04 CST