RE: HTML5 encodings (was: Re: BOCU patent)

From: verdy_p (verdy_p@wanadoo.fr)
Date: Mon Dec 21 2009 - 14:45:21 CST

    "Chris Weber"
    > The security issue is largely a red herring. Security of HTML encodings
    > is related to incorrect auto-discovery of encodings, not to using
    > encodings that have been properly announced. Even UTF-7, while
    > generally undesirable and unnecessary for Web pages, is "secure" if
    > correctly identified.
    >
    > Henri Sivonen stated that the main reason for prohibiting encodings was
    > to avoid "wasting developer time" and focusing attention on support of
    > new features instead. Apparently he didn't feel developers were capable
    > of both.

    This is a really stupid assumption: as if browsers were made of a single component, written by a single developer,
    and tested by nobody else. In reality, browsers now largely use components shared across various projects. These
    include correct and exact charset support for handling plain text and XML, a correct XML parser for handling
    XHTML, a correct implementation of the DOM tree (and its binding to JavaScript), a correct CSS parser for mapping
    CSS styles to the DOM tree, and correct implementations of various renderers (for various content types: not just
    plain text but also image and audio codecs). All this requires very different skills, and the components really
    need to be developed separately and work like black boxes with a minimal API.

    The question of charsets is really the least complex one to handle in a browser, and there's absolutely no benefit
    in trying to avoid it. In fact, by violating rules that were validated and tested for XML and for past versions,
    the new prohibition will just create more problems than it solves, because it defeats its own intended target,
    which was "compatibility with legacy applications". That is exactly the target that motivated the other
    violations, such as "extending" US-ASCII and ISO-8859-1 to Windows-1252, only because C1 controls were in theory
    forbidden in HTML and XML. The exception is the NEXT LINE control inherited from EBCDIC and mapped at 0x85 in
    ISO-8859-1, which is part of the collapsible whitespace and of the line separators, and which will now be
    interpreted as an ellipsis in Windows-1252!
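
    To make the 0x85 case concrete, here is a minimal sketch in Python (my choice of language, and the helper name
    decode_byte is purely illustrative). It only shows how the same byte is read under each declared charset; it
    says nothing about how any particular browser behaves.

```python
# Illustrative sketch: how the single byte 0x85 is read under each charset.
RAW = b"\x85"


def decode_byte(raw: bytes, charset: str) -> str:
    """Decode one byte with the given charset label (hypothetical helper)."""
    return raw.decode(charset)


# In ISO-8859-1 the byte maps to the C1 control U+0085 (NEXT LINE).
as_latin1 = decode_byte(RAW, "iso-8859-1")

# In Windows-1252 the same byte maps to U+2026 (HORIZONTAL ELLIPSIS).
as_cp1252 = decode_byte(RAW, "windows-1252")

print(f"ISO-8859-1:   U+{ord(as_latin1):04X}")
print(f"Windows-1252: U+{ord(as_cp1252):04X}")

# Python's str.splitlines treats U+0085 as a line boundary, but not U+2026,
# so the reinterpretation changes where lines break.
print("a\x85b".splitlines())
print("a\u2026b".splitlines())
```

    The two decodings give different code points with different roles, a line separator in one case and a
    punctuation character in the other, which is the incompatibility described above.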

    Most of the violations introduced in HTML5 just create more problems than they solve: most development is now
    moving toward XML, with HTML being converted more and more to strict XML conformance, even if most pages still
    have a few violations of strict XML conformance.

    Is HTML5 already a dead standard, exactly because of its lack of compatibility with BOTH the past (HTML4) AND the
    more and more present future (XML and XHTML)? This is what I think. It has no future like this. In fact, the
    battle is not there: it is in the evolution of stylesheets, i.e. CSS3, where we should be more interested in
    getting support for better typography.

    What I really hope is that browsers will prefer violating the stupid HTML5 rules, rather than risking violations
    of HTML4 or of XML and XHTML.

    Who suggested these violation rules? Everything seems to point to Microsoft, as they really look inspired by the
    existing standard violations found in IE, which Microsoft has long promised to correct. If HTML5 is adopted like
    this, it will just reduce the remaining work for Microsoft, and will increase the work for others, who will
    suddenly find themselves in violation. But maybe it was inspired by lazy web designers who don't want to support
    the newer standards and who still depend on existing "IE quirks".

    The more I read the HTML5 proposal, the more problems I see in it. The violations adopted on purpose are really a
    big hint to alert others: don't use it, keep HTML4 or go directly to XHTML. Thankfully, all major browsers
    (including IE) have already adopted XHTML. In addition, HTML5 really comes too late: working on it will just be a
    waste of time.

    Philippe.



    This archive was generated by hypermail 2.1.5 : Mon Dec 21 2009 - 14:46:38 CST