Re: Roman Numerals (was Re: Improper grounds for rejection of proposal N2677)

From: Jukka K. Korpela
Date: Fri Oct 28 2005 - 11:16:50 CST

    On Fri, 28 Oct 2005, John H. Jenkins wrote:

    > On Oct 28, 2005, at 9:00 AM, Andrew S wrote:
    >> Those are separate issues from what I'm asking about. My question is "Why
    >> should the Latin letters be used instead of the dedicated Roman Numeral
    >> Characters which do (regardless of the reason) exist in Unicode?"
    > 1) The dedicated Roman numerals only go up to twelve, with spotty support
    > beyond that.

    There are a few more, such as those for fifty, one hundred, five hundred
    and one thousand, and you could always represent e.g. the numeral for
    thirteen as U+2169 U+2162 (ten followed by three). So although the
    argument indirectly refers to
    the fact that the numerals were included for rather specific purposes,
    it's not logically convincing.
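
    To make the point concrete, here is a minimal sketch using Python's
    standard unicodedata module. The helper name additive_value is
    hypothetical and introduced only for illustration; it simply sums the
    numeric values of the characters, so it is valid only for purely
    additive sequences such as ten followed by three.

```python
import unicodedata

# U+2169 ROMAN NUMERAL TEN followed by U+2162 ROMAN NUMERAL THREE
thirteen = "\u2169\u2162"

# A few of the dedicated numerals beyond twelve (U+216C..U+216F)
beyond_twelve = "\u216C\u216D\u216E\u216F"


def additive_value(numerals: str) -> float:
    """Sum the numeric values of the characters (hypothetical helper).

    Works only for additive sequences such as X + III; it makes no
    attempt to handle subtractive forms built from several characters.
    """
    return sum(unicodedata.numeric(ch) for ch in numerals)


for ch in thirteen + beyond_twelve:
    # Print code point, character name and numeric value
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} = {unicodedata.numeric(ch)}")

print(additive_value(thirteen))  # 13.0
```

    Each dedicated numeral carries a numeric value of its own, which is
    what would let software treat it as a number rather than as a letter.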

    > 2) Without fancy keyboard pyrotechnics, the dedicated Roman numerals would be
    > typed and deleted one at a time. E.g., if a user accidentally entered "VIII"
    > and realized that there was one "I" too many, a backspace would delete the
    > whole thing instead of just the final "I". This is not likely the behavior
    > the user expects.

    Probably so, but would this be a problem for users who consciously choose
    to use those numerals? After all, we are not discussing whether
    everyone should use them but whether some people could use them. Besides,
    it wouldn't be the end of the world as we know it if a delete function
    behaved in a somewhat unexpected way in such a case.

    > 3) Having said that, the dedicated Roman numerals would still be appropriate
    > to use in some limited contexts. E.g., if you're laying out Asian text
    > vertically and want the Roman numerals "I" through "XII" to be interspersed
    > *horizontally* in the text.

    This is an important practical point. The intended use for the dedicated
    Roman numerals is in such contexts, and this implies that their glyphs
    should be expected to reflect that, i.e. to be suitable for use in
    vertical text. Therefore they cannot be typographically very suitable
    for "normal", horizontal text.

    There is another argument, of a much more general nature. The Unicode
    standard says, in clause 3.7:

    "Compatibility decomposable characters ... support transmission and
    processing of legacy data. Their use is discouraged other than for legacy
    data or other special circumstances."

    (The Roman numeral characters that we are discussing are compatibility
    decomposable to sequences of Latin letters.)
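
    The compatibility decomposition can be inspected directly. The
    following is a small sketch, again assuming Python's standard
    unicodedata module; the variable names are illustrative only.

```python
import unicodedata

eight = "\u2167"  # ROMAN NUMERAL EIGHT, a single character

# The character database records a <compat> mapping to Latin letters
print(unicodedata.decomposition(eight))  # <compat> 0056 0049 0049 0049

# Canonical normalization leaves the character alone ...
canonical = unicodedata.normalize("NFC", eight)

# ... while compatibility normalization folds it to the letters V I I I
folded = unicodedata.normalize("NFKC", eight)

print(len(eight), len(folded))  # 1 4
print(folded)                   # VIII
```

    The length difference also shows why a backspace removes the whole
    numeral at once: the dedicated form is one character, whereas the
    Latin-letter form is four.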

    None of this means that there is no _need_ for the dedicated
    Roman numerals. It just means that the arguments against using them
    outside their specific scope of use are probably stronger than the arguments
    in favor of them. For example, it _would_ be useful if a speech
    synthesizer could read "Charles I" as "Charles the first" rather than
    "Charles eye", and if the "I" were written as a dedicated Roman numeral,
    the software could know that it is unambiguously a number, not e.g. the
    personal pronoun "I". But this won't happen, for a multitude of reasons.
    Such things need to be handled at other protocol levels, such as markup
    (even though there's no useful general-purpose markup for such things at

    Jukka "Yucca" Korpela,
