Re: 00B4 and 02CA

From: Jukka K. Korpela
Date: Mon Sep 27 2010 - 00:37:13 CDT


    abysta wrote:

    > What is the difference between 00B4 and 02CA?

    U+00B4 ACUTE ACCENT is a legacy character with ambiguous semantics.
    U+02CA MODIFIER LETTER ACUTE ACCENT is a modifier letter.
    The formal properties for these characters, as defined in the Unicode
    standard, reflect this difference to some extent. For example, the Unicode
    line breaking rules allow a break before U+00B4 but not before U+02CA.
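    Some of these formal properties can be inspected directly. The following
    is a minimal sketch using Python's standard unicodedata module; the names
    ACUTE, MODIFIER_ACUTE and describe are only illustrative. It shows the
    character name, general category and decomposition mapping. The line
    breaking class mentioned above is not exposed by unicodedata, so it is not
    shown here.

```python
import unicodedata

ACUTE = "\u00b4"           # U+00B4 ACUTE ACCENT (legacy character)
MODIFIER_ACUTE = "\u02ca"  # U+02CA MODIFIER LETTER ACUTE ACCENT


def describe(ch):
    """Return (name, general category, decomposition, isalpha) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),       # "Sk" = modifier symbol, "Lm" = modifier letter
        unicodedata.decomposition(ch),  # empty string if there is no decomposition
        ch.isalpha(),                   # True for characters in the letter categories
    )


for ch in (ACUTE, MODIFIER_ACUTE):
    print(f"U+{ord(ch):04X}", describe(ch))
```

    In the Unicode data shipped with Python, U+00B4 is a modifier symbol with
    a compatibility decomposition to SPACE followed by COMBINING ACUTE ACCENT,
    whereas U+02CA is a modifier letter with no decomposition.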

    Chapter 7 of the standard describes the Unicode view on modifier letters
    (page 28 in the PDF, page 250 in the standard).

    The general idea seems to be that legacy characters like U+00B4 were
    duplicated as modifier letters because ISO 8859 is ambiguous about their
    role as spacing vs. nonspacing. However, it seems to me that ISO 8859 says,
    somewhat obscurely but clearly, that all characters in it are spacing. On
    the other hand, in implementations, U+00B4 has often been used as a
    nonspacing diacritic mark. Moreover, it has often been used as a poor man’s
    right single quotation mark, e.g. as in `foobar´, meant to represent ‘foobar’.

    U+02CA is meant to be unambiguous as regards its general nature as a
    letter (character used in words), though its specific meaning (e.g., as a
    tone mark when writing a tone language in Latin letters) has not been fixed.

    If you consider using U+02CA, note that not even fairly modern programs
    should be expected to treat it as a letter (e.g., so that double-clicking on
    a word containing it would select the entire word, instead of stopping at
    U+02CA). Such treatment is suggested, but not required, by the standard.
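    To illustrate what treating U+02CA as a letter means in practice, here is
    a small sketch that splits text into words with Python's re module, whose
    \w class follows the letter categories for str patterns. The helper name
    words and the sample strings are made up for the example. It shows one
    implementation's behavior only and says nothing about how any particular
    editor or word processor selects words.

```python
import re

# \w+ matches runs of letters, digits and underscore in a Unicode string.
WORD = re.compile(r"\w+")


def words(text):
    """Split text into the word-like runs that \\w+ recognizes."""
    return WORD.findall(text)


with_modifier = "ba\u02cada"  # contains U+02CA MODIFIER LETTER ACUTE ACCENT
with_legacy = "ba\u00b4da"    # contains U+00B4 ACUTE ACCENT

print(words(with_modifier))  # one run: U+02CA counts as a word character
print(words(with_legacy))    # two runs: U+00B4 splits the word
```

    A program that classifies characters in this way keeps a word containing
    U+02CA together, while the same word written with U+00B4 is split in two.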

    Moreover, font support for U+02CA is rather limited; see

