From: Philippe Verdy (firstname.lastname@example.org)
Date: Wed Nov 02 2005 - 18:12:54 CST
Kenneth Whistler answered:
> For a *single* character, as seen in the table Philippe
> cited for the French Wikipedia entry, use of combining
> overscore is perfectly appropriate as a way to represent
> such text elements. For construction of entire, long,
> numerical expressions, it is not.
I did not demonstrate that. In fact, if you look closely at the actual
encoding of the French Wikipedia article at:
You'll see that it DOES NOT use any macron or overscore diacritic. I also
demonstrated how it defeats the separation of style and plain text, and
that even the styling was not easy to achieve (for example, the overline
decoration style can be applied only once, and two overlines require
combining it with borders, which don't share the same coloring
("border-color:" style) as text elements that are colored with the "color"
Look again at the table named "Notation classique" lower on the page, by
clicking on the "Extensions classiques" title in the TOC (section number 6).
So I showed that this was just a tweak applicable only locally within a
given HTML page, provided that we can control how the page will be rendered.
As soon as one activates the accessibility feature of Firefox, for example,
which disables all CSS and renders the text with only a very basic layout,
the text simply looks wrong. The same will be true for other types of
Using style markup to express the text semantics goes AGAINST all the
objectives of HTML and CSS, which call for a clear separation between them.
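To make the point about rendering without CSS concrete, here is a minimal Python sketch. It is not taken from the Wikipedia page: the markup strings and the names TextOnly and plain_text are hypothetical, chosen only for illustration. It keeps just the character data of an HTML fragment, which is roughly all that a renderer with styles disabled has to work with, and compares an overline expressed as style with one encoded as U+0305 COMBINING OVERLINE.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data only, discarding every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def plain_text(html):
    """Return the text content of an HTML fragment with all markup dropped."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical fragments: the same Roman digit V with an overline,
# once carried by style markup and once carried by the encoded text.
styled = '<span style="text-decoration: overline">V</span>'
encoded = "V\u0305"  # V followed by U+0305 COMBINING OVERLINE

# With the style ignored, the styled digit is indistinguishable from a bare V.
print(plain_text(styled) == "V")
# The encoded overline survives, because it is part of the text itself.
print(plain_text(encoded) == "V\u0305")
```

Under these assumptions, the styled fragment degrades to a plain V once the markup is ignored, while the encoded form keeps its overline whatever the rendering conditions.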
The only way to solve the problem would be to express the intended semantics
not with style markup, but with structure markup (but structure markup in
HTML only provides for the document structure, at the paragraph level, and
nothing at the character level), or with the encoding of text-only HTML
elements. This would then require additional diacritics.
That's a good reason why a new separate structural markup was designed for
maths (MathML), with its own syntax, avoiding any attempt to pollute either
the unstructured stream of characters at the plain-text level, or the style
markup. XML was also designed to add structural information around text (it
was not designed to provide rendering style information, and even the most
common CSS syntax for style does not use XML in encoded documents, as the
XML perception of rendering style is inferred within the HTML document
rendering engine, rather than given by the rendered HTML document itself).
But for Roman numerals, the structure is the Roman number itself as a whole:
there's no such separation of interpretation between the characters and
diacritics that make up a Roman numeral digit and their actual meaning, as
it is their exact order and individual values that make the whole number
interpretable (just as for words, whose meaning depends on the exact order
and individual values of their component letters). Thinking that there's a
distinction between base Roman digits and the diacritics that are applied
to them is nonsense: they form an autonomous and complete system by
So either they should be encoded COMPLETELY out of the plain-text stream
(yes there should then be no character at all for Roman numbers in the
plain-text stream), or they should be fully integrable as such.
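The claim that the digits and their diacritics form a single system can be sketched in Python. This is only an illustration under stated assumptions: each overline is taken to multiply its digit by 1000 (the usual vinculum convention), the overline is represented by U+0305 COMBINING OVERLINE, and the names roman_digits and roman_value are hypothetical. The sketch applies the ordinary subtractive rule and does not check that the numeral is well formed.

```python
COMBINING_OVERLINE = "\u0305"
BASE_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_digits(text):
    """Turn text into a list of digit values.

    Assumption: each combining overline multiplies the preceding digit by 1000.
    """
    values = []
    for ch in text:
        if ch == COMBINING_OVERLINE:
            if not values:
                raise ValueError("overline without a base digit")
            values[-1] *= 1000
        elif ch in BASE_VALUES:
            values.append(BASE_VALUES[ch])
        else:
            raise ValueError(f"not a Roman digit: {ch!r}")
    return values


def roman_value(text):
    """Evaluate the numeral: a digit smaller than its successor is subtracted."""
    values = roman_digits(text)
    total = 0
    for i, value in enumerate(values):
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


print(roman_value("XIV"))        # 14
print(roman_value("V\u0305I"))   # 5001: overlined V, then I
print(roman_value("V\u0305I".replace(COMBINING_OVERLINE, "")))  # 6
```

The last line shows what is at stake: dropping the overline, as happens when it is carried only by style, silently turns 5001 into 6, just as dropping or reordering a base digit would change the number.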