Armenian numbering: findings, recommendations and request to CSS WG

From: Richard Ishida (ishida@w3.org)
Date: Fri Jan 30 2009 - 11:26:05 CST


    This email attempts to:

    1. summarise the information I have gleaned so far from the various helpful and appreciated responses to my question about Armenian numbering in CSS, including, in particular, the advice from Armenian typographer Hrant Papazian (Thomas, please thank him).

    2. make some recommendations for what to do with CSS3's definition of list-style-type: armenian, and what course IE8 should plot, faced with a new implementation that could conform to a non-authoritative spec or existing practice, but not both.

    3. propose next steps for the CSS WG.

    For detailed discussion see the thread starting at [1].

    FINDINGS

    Most descriptions refer to 7000 as being represented by a single character (Ւ U+0552 ARMENIAN CAPITAL LETTER YIWN). In fact Daniels lists the combination ՈՒ (U+0548 U+0552) as representing a different sound with no numeric value.

    Hrant Papazian's explanation is that: [[
    U+0582 is a normal part of Armenian text, it's just
    that in the reform spelling it only happens after a
    U+0578 so some people like to reflect that in the
    alphabet itself. The problem is that if you have a text
    that uses U+0582 without a U+0578 preceding it (for
    example from the Diaspora, or somebody older than
    the reform :-) the reform alphabet is stuck. That's the
    reason the Unicode standard shows just U+0582 for
    that letter; you can then build whatever you need.

    In fact I've personally never seen U+0578 prefixing
    U+0582 in Armenian numbering, I guess because it's
    by nature an archaic thing so the reform doesn't jive.
    ]]

    The combination ՈՒ (U+0548 U+0552) may also be non-optimal because it is ambiguous, given that order is not necessarily constrained, and so it can also mean 7600.
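    To make the ambiguity concrete, here is a minimal sketch of a purely additive reading, in which letter values are summed regardless of order. It assumes the usual letter values (U+0531 to U+0554 running 1-9, 10-90, 100-900, 1000-9000); the function names letter_value and additive_value are hypothetical and do not come from the CSS module or any implementation.

```python
def letter_value(ch):
    """Numeric value of one upper-case Armenian letter (U+0531..U+0554).

    Assumption: the 36 letters are laid out in code point order as
    1-9, 10-90, 100-900, 1000-9000.
    """
    offset = ord(ch) - 0x0531
    if not 0 <= offset < 36:
        raise ValueError("not an upper-case Armenian numeral letter: %r" % ch)
    return (offset % 9 + 1) * 10 ** (offset // 9)


def additive_value(text):
    """Sum the letter values, ignoring the order of the letters."""
    return sum(letter_value(ch) for ch in text)


single = "\u0552"          # YIWN alone
digraph = "\u0548\u0552"   # VO followed by YIWN

print(additive_value(single))   # 7000
print(additive_value(digraph))  # 7600
```

    Under this reading a consumer cannot tell whether U+0548 U+0552 was meant as the digraph for 7000 or as 600 + 7000, which is the concern raised above.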

    The character COMBINING CIRCUMFLEX ACCENT (U+0302) doesn't naturally span two characters as described in the CSS module, so this casts doubt on its appropriateness for a situation where numbers can be represented by multiple letters (i.e. 7000).

    The character shown above 10,000 etc. is very rare, and if forced to choose between a circumflex and a bar, Hrant would choose the bar. On the other hand, Hrant guesses that it is more likely to be something like U+055F ARMENIAN ABBREVIATION MARK, or patiw. However, this is currently a spacing character, not a combining mark (though this may be an error in Unicode given [2]), and in any case it is rarely used and looks archaic. In brief, it is rather obscure.

    The use of U+0585 and U+0586 as alternatives for 10,000 and 20,000 is not uncommon, but since 30,000 etc doesn't exist, this is not an elegant convention.

    Upper vs lower case:

    Hrant Papazian: [[
    Since Armenian numbering is older than the Armenian
    lowercase set, that's not so bad. On the other hand for
    stylistic reasons it's worth supporting lc numbering.
    ]]

    IE8 has already reverted to upper case for list-style-type: armenian for the build, like other user agents.

    RECOMMENDATIONS

    Change the CSS3 text to say one of the following:
    1. armenian is a synonym of upper-armenian
    2. the case of list-style-type: armenian is implementation-defined

    I would recommend that IE8 use upper-case to conform to the behaviour of all other major browsers tested.

    Change the CSS3 text to say one of the following:
    1. 7000 is represented by Ւ U+0552 ARMENIAN CAPITAL LETTER YIWN
    2. 7000 is typically represented by Ւ U+0552 ARMENIAN CAPITAL LETTER YIWN but some implementations may prefer to use ՈՒ (U+0548 U+0552), noting that this involves potential for ambiguity

    I would recommend that IE8 use the single character, in line with Firefox and Opera.

    Change the CSS3 text to say one of the following:
    1. armenian numbering is only defined in this specification as far as 9,999. Beyond that the representation is implementation-defined.
    2. armenian numbering is only defined in this specification as far as 9,999. Higher numbers will be defined in a later version of the specification.

    I'm not sure what to recommend to IE8.

    I would recommend that all browsers consider implementing upper-armenian and lower-armenian to provide Armenian users with a choice.
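    As an illustration of what the recommendations above would amount to, here is a minimal sketch of a converter that uses the single letter for 7000, stops at 9,999, and offers both cases. It assumes the usual letter values (U+0531 to U+0554 running 1-9, 10-90, 100-900, 1000-9000, with the lower-case set starting at U+0561); the name to_armenian and its lower parameter are hypothetical, and this is not the algorithm text of the CSS module.

```python
def to_armenian(n, lower=False):
    """Format n (1..9999) as an additive Armenian numeral.

    Assumption: letters are taken in code point order, nine per decimal
    place, starting at U+0531 (upper case) or U+0561 (lower case).
    """
    if not 1 <= n <= 9999:
        # Higher numbers are deliberately left undefined in this sketch.
        raise ValueError("only 1..9999 is covered here")
    base = 0x0561 if lower else 0x0531
    letters = []
    for power in (3, 2, 1, 0):           # thousands, hundreds, tens, units
        digit = (n // 10 ** power) % 10
        if digit:                        # a zero digit contributes no letter
            letters.append(chr(base + 9 * power + digit - 1))
    return "".join(letters)


print(to_armenian(7000))              # the single letter U+0552
print(to_armenian(7000, lower=True))  # the single letter U+0582
print(to_armenian(2009))              # U+054D U+0539
```

    Raising an error above 9,999 mirrors the option of leaving higher numbers out of scope; an implementation could equally fall back to decimal digits there.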

    NEXT STEPS

    I propose that

    1. if the CSS WG or its followers object to the recommendations I am making for IE8's implementation, they speak up now.

    2. the CSS WG either formally notes that it must make a decision on the recommendations for the CSS3 text above, or takes those decisions now and documents them so that they can be added to the Lists module text when it is next edited.

    3. the CSS WG passes a resolution on whether it will do 2 or not.

    Hope that helps,
    RI

    [1] http://lists.w3.org/Archives/Public/www-international/2009JanMar/0023.html

    [2] http://www.evertype.com/standards/hy/n1395tech.html

    ============
    Richard Ishida
    Internationalization Lead
    W3C (World Wide Web Consortium)

    http://www.w3.org/International/
    http://rishida.net/


