From: Philippe Verdy (firstname.lastname@example.org)
Date: Sun Aug 15 2004 - 10:22:20 CDT
From: "Doug Ewell" <email@example.com>
> Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
> > Shamely,
> I wish I knew which real English word you mean by this. "Shamefully"?
> "Sadly"? "Unfortunately"? "Embarrassingly"?
I know that I use this word instead of "unfortunately". I don't know where I
learnt it, but I use it frequently...
> > the idea of "block-level" and "inline" elements is specific to HTML,
> > but HTML today is an application of XML, and the problem must be
> > solved at the XML level.
> HTML is not an application of XML. HTML and XML are both applications
> of SGML. XHTML, which I use and recommend, is an application of HTML
> *to* XML.
You did not need to specify this. I said "TODAY" which means the *current*
standard version of HTML, which is now XHTML, i.e. really an application of
XML (the legacy syntax with unclosed elements and unquoted attribute values,
allowed in HTML and SGML, is being deprecated as it is forbidden in XML)...
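To illustrate the difference in strictness, here is a minimal sketch in
Python using the standard library's XML parser. The helper name
is_well_formed_xml and the two sample fragments are only assumptions made
for this illustration; they come from no specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(fragment: str) -> bool:
    """Return True if `fragment` parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Legacy HTML-style syntax: unquoted attribute value and an unclosed <br>.
legacy = '<p class=note>First<br>Second</p>'
# The same content written with the restricted XML (XHTML) syntax.
strict = '<p class="note">First<br/>Second</p>'

print(is_well_formed_xml(legacy))
print(is_well_formed_xml(strict))
```

An XML parser rejects the legacy fragment outright, whereas an HTML4 or
SGML parser working with the appropriate DTD could accept it.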
What I mean here is that a solution for disambiguating the grapheme cluster
boundaries that collide during normalization with the ?ML lexical analysis,
if it works with the restricted XML syntax, will then also work with
XHTML, HTML4 or lower, or even with SGML, which is the ancestor of the
It's a place where the W3C (for XML, XHTML and HTML4 or lower) and the SGML
consortium can make recommendations.
Of course there's the Unicode Technical Report #20 that speaks about the
case of XML. For Unicode it is only informative; the most important thing is
that this document is co-signed by the W3C, on 13 June 2003, and so is now
an appropriate (but incomplete) response of the W3C to this problem.
UTR#20 does not completely cover the subject, as it still says nothing about
the change in Unicode 4.0.1 related to the use of ZW(N)J in rule D17 and
Maybe Martin Dürst of the W3C should look closely at the effect of D17
and at whether UTR#20 should be updated...
I don't know if there's some similar recommendation from the SGML
Similar problems may also exist in other languages or protocols that use
Unicode and that are now exposed to this change, which may break their
existing syntax. In some of these cases, the solution with NCRs will
not be so easy to find, and these other protocols or languages using Unicode
may need to apply further restrictions on what they consider "valid
Unicode strings", or may simply choose to NOT apply the D17 change (so that
a string containing only a ZW(N)J character will still be valid and won't
collide with the language syntax).
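
As a rough sketch of what "the solution with NCRs" could look like for
XML-like text content, the Python below replaces a leading combining mark,
ZWJ or ZWNJ with a numeric character reference, so that it cannot attach to
the preceding markup delimiter (such as ">") when the document is
normalized or segmented. The function names are hypothetical, and escaping
only the first character is a simplifying assumption of this sketch; it is
not something UTR#20 or the XML specification prescribes.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def extends_sequence(ch: str) -> bool:
    """True if `ch` can continue a combining character sequence.

    Assumption: this covers combining marks (General_Category Mn, Mc, Me)
    plus ZWJ and ZWNJ, following the revised D17 discussed above.
    """
    return ch in (ZWNJ, ZWJ) or unicodedata.category(ch).startswith("M")


def escape_leading_extender(text: str) -> str:
    """Write the first character as an NCR if it would attach to what precedes it."""
    if text and extends_sequence(text[0]):
        return f"&#x{ord(text[0]):X};" + text[1:]
    return text


# Element content that begins with a ZWJ, then with a combining acute accent.
print(escape_leading_extender(ZWJ + "abc"))
print(escape_leading_extender("\u0301x"))
print(escape_leading_extender("plain text"))
```

A language with no NCR-like escape mechanism has nothing equivalent to
fall back on, which is why it may have to restrict its valid strings
instead.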