Re: HTML Validation (was Re: Clean and Unicode compliance)

From: Martin Duerst (duerst@w3.org)
Date: Sun Dec 16 2001 - 19:38:28 EST


Hello James (and everybody else),

Can you please send comments and bug reports on the validator to
www-validator@w3.org? Sending bug reports to the right address
seriously increases the chance that they get fixed.

Regards, Martin.

At 14:46 01/12/16 -0800, James Kass wrote:

>Elliotte Rusty Harold wrote,
>
> >
> > I suspect a lot of our tools haven't been thoroughly tested with
> > Plane-1 and are likely to have these sorts of bugs in them.
>
>Since Plane One is still fairly new, this is understandable.
>
>I'm also having trouble getting Plane Zero pages to validate.
>
>Spent several hours revising some of my pages as a result of
>some kindly off-list suggestions. (Most of the pages on my site
>were rewritten to pass Tidy.exe long ago, and apparently were
>already correct.) After getting the revised pages to pass the
>Tidy validator (which is also from w3), it was a big surprise
>that the first four pages checked with the W3 validator failed
>to pass.
>
>Amazingly, some pages didn't pass because &quot; wasn't recognized
>as a valid named entity.
>
>After Tidy warned that <STYLE> tags need a type attribute, I went ahead
>and added one, but the W3 validator insists that a type attribute in a
>STYLE tag invalidates the page if it is HTML 3.2 (IIRC).
>
>Just for fun, tried validating a page from W3's own site,
>http://validator.w3.org/sgml-lib/WD-html40-970708/entities.html
>
>It failed too. A fatal error was generated because the page lacks
>the DOCTYPE declaration, and the validator just can't seem to get
>past that.
>
>There's an interesting article about how use of the DOCTYPE breaks
>existing web pages at:
>http://www.netmechanic.com/news/vol4/html_no22.htm
>
>One big issue with the W3 validator is that it doesn't seem to
>recognize charset=x-user-defined as a valid character set. Since the
>pages marked as user defined use NCRs, technically they could
>be considered to be in UTF-8 (since the pages are actually encoded
>in ASCII), but using the UTF-8 declaration in such pages breaks
>the display.
>
>M.S.I.E. has always behaved a bit erratically with UTF-8, although
>newer versions of the browser have offered slight improvements
>in this regard. Pages made with NCRs often display differently
>from identical UTF-8 pages even though there is no reason for
>this to happen. The NCR pages are usually the ones which display
>as expected.
>
>Correct display is paramount. Other issues are secondary.
>
>Best regards,
>
>James Kass.
>
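
A small illustration of the NCR point in the quoted message, as a sketch
only: a page written entirely with numeric character references contains
nothing but ASCII bytes, so those bytes decode the same way whether the
page is labelled ASCII or UTF-8. The helper name to_ncr_ascii and the
sample string are made up for this example, and it says nothing about how
any particular browser or validator treats the charset label.

```python
import html


def to_ncr_ascii(text: str) -> bytes:
    """Encode text as pure ASCII, replacing each non-ASCII character
    with a decimal numeric character reference (NCR)."""
    return text.encode("ascii", errors="xmlcharrefreplace")


# Hypothetical sample: U+00E9 is in Plane Zero, U+10330 is in Plane One.
sample = "caf\u00e9 \U00010330"

ncr_bytes = to_ncr_ascii(sample)
utf8_bytes = sample.encode("utf-8")

# The NCR form is plain ASCII, so it decodes identically under either label.
assert ncr_bytes.decode("ascii") == ncr_bytes.decode("utf-8")

# The raw UTF-8 form carries different bytes for the same characters.
assert ncr_bytes != utf8_bytes

# Resolving the references recovers the original text, Plane One included.
assert html.unescape(ncr_bytes.decode("utf-8")) == sample

print(ncr_bytes)
```

Since the two forms describe the same characters, any difference in how
they display would come from the software reading them, not from the
markup itself.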


