From: Jukka K. Korpela (email@example.com)
Date: Fri Oct 28 2005 - 11:16:50 CST
On Fri, 28 Oct 2005, John H. Jenkins wrote:
> On Oct 28, 2005, at 9:00 AM, Andrew S wrote:
>> Those are separate issues from what I'm asking about. My question is "Why
>> should the Latin letters be used instead of the dedicated Roman Numeral
>> Characters which do (regardless of the reason) exist in Unicode?"
> 1) The dedicated Roman numerals only go up to twelve, with spotty support
> beyond that.
There are a few more, and you could always represent e.g. the numeral for
thirteen as U+2169 U+2162. So although the argument indirectly refers to
the fact that the numerals were included for rather specific purposes,
it's not logically convincing.
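To make the composition concrete, here is a small sketch (Python, chosen here only for illustration): U+2169 (ten) followed by U+2162 (three) gives a plausible "thirteen", and compatibility normalization maps it to the plain Latin letters:

```python
import unicodedata

# Compose "thirteen" from ROMAN NUMERAL TEN (U+2169) and ROMAN NUMERAL THREE (U+2162).
thirteen = "\u2169\u2162"
print(thirteen)                                   # ⅩⅢ

# NFKC normalization decomposes the dedicated numerals to Latin letters.
print(unicodedata.normalize("NFKC", thirteen))    # XIII
```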
> 2) Without fancy keyboard pyrotechnics, the dedicated Roman numerals would be
> typed and deleted one at a time. E.g., if a user accidentally entered "VIII"
> and realized that there was one "I" too many, a backspace would delete the
> whole thing instead of just the final "I". This is not likely the behavior
> the user expects.
Probably so, but would this be a problem for users who consciously choose
to use those numerals? After all, we are not discussing the question whether
everyone should use them but whether some people could use them. Besides,
it wouldn't be the end of the world as we know it if a delete function
behaved in a somewhat unexpected way in such a case.
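The deletion behavior follows directly from the code point counts; a quick check (Python, used here just to inspect the strings) shows that the dedicated numeral for eight is a single code point while the letter spelling is four, so a character-wise backspace treats them differently:

```python
# U+2167 ROMAN NUMERAL EIGHT is one code point; "VIII" is four Latin letters.
dedicated = "\u2167"   # Ⅷ
letters = "VIII"

print(len(dedicated))  # 1 -> one backspace removes the whole numeral
print(len(letters))    # 4 -> one backspace removes only the final "I"
```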
> 3) Having said that, the dedicated Roman numerals would still be appropriate
> to use in some limited contexts. E.g., if you're laying out Asian text
> vertically and want the Roman numerals "I" through "XII" to be interspersed
> *horizontally* in the text.
This is an important practical point. The intended use for the dedicated
Roman numerals is in such contexts, and this implies that their glyphs
should be expected to reflect that, i.e. to be suitable for use in
vertical text. Therefore they cannot be typographically very suitable
for "normal", horizontal text.
There is another argument, of a much more general nature. The Unicode
standard says, in clause 3.7:
"Compatibility decomposable characters ... support transmission and
processing of legacy data. Their use is discouraged other than for legacy
data or other special circumstances."
(The Roman numeral characters that we are discussing are compatibility
decomposable to sequences of Latin letters.)
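That decomposition can be verified programmatically; a sketch using Python's standard unicodedata module:

```python
import unicodedata

twelve = "\u216B"  # Ⅻ ROMAN NUMERAL TWELVE

# The character carries a <compat> decomposition to the letters X, I, I ...
print(unicodedata.decomposition(twelve))       # <compat> 0058 0049 0049

# ... so compatibility normalization (NFKC) yields plain Latin letters.
print(unicodedata.normalize("NFKC", twelve))   # XII
```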
This all doesn't mean that there would be no _need_ for the dedicated
Roman numerals. It just means that the arguments against using them,
outside the specific scope of use, are probably stronger than arguments
in favor of them. For example, it _would_ be useful if a speech
synthesizer could read "Charles I" as "Charles the first" rather than
"Charles eye", and if the "I" were written as a dedicated Roman numeral,
the software could know that it is unambiguously a number, not e.g. the
personal pronoun "I". But this won't happen, for a multitude of reasons.
Such things need to be handled at other protocol levels, such as markup
(even though there's no useful general-purpose markup for such things at
present).
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
This archive was generated by hypermail 2.1.5 : Fri Oct 28 2005 - 11:18:16 CST