RE: Fun with proof by analogy, was Re: Mojibake on my Web pages

From: Jill Ramonsky (
Date: Tue Sep 30 2003 - 06:44:50 EDT

    Good point. But there has to be an actual attacker here, as in, a hacker
    engaged in a purposefully malevolent attempt to (say) run arbitrary code
    on a victim's machine (the victim being an end-user, a web-page
    viewer). To achieve this, the attacker must exploit "features" of the
    victim's browser. Yes, I was assuming that the attacker was a document
    author -- but if the attacker was a server (or at least, a server
    administrator), then it's difficult to see what a document author can do
    to guard against this. If the server is an attacker, they could of
    course modify all documents served anyway, in any manner they chose. In
    such a circumstance, document authors would be well advised to move
    their documents to another server ... assuming they ever found out.

    The attack is only theoretical, so far as I know, but basically it works
    like this: the attacker puts (say)
    "C:\WINNT\SYSTEM32\CMD.EXE (plus some nasty parameters)" in a hyperlink
    and encourages you to click on it. If all is well, the browser should
    forbid this. But if the string is written in encoding A, and the
    browser parses it assuming it to be encoding B, it is possible that the
    browser may not recognise the path as being absolute, and so may allow
    it. Of course, you'd have to try /really hard/ to find encodings A and
    B such that this becomes feasible, but you never know, it might be
    doable. Plus, you'd have to find a user dumb enough to be running a
    sufficiently old browser that it was still prone to this exploit. (I'm
    pretty sure modern browsers will have closed that hole by now, but
    again, you never know). But even a buggy and stupid browser will never
    fall victim to this exploit if the browser is able to infer the correct
    encoding for the document.
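
    A minimal Python sketch of the mismatch described above, not a real
    exploit. It assumes UTF-7 as "encoding A" and ASCII as "encoding B";
    the looks_absolute() check is a hypothetical stand-in for whatever
    test a browser might apply, not any actual browser's logic.

```python
import re

# Hypothetical filter: treats Windows drive-letter paths as "absolute".
ABSOLUTE_PATH = re.compile(r"^[A-Za-z]:\\")


def looks_absolute(text: str) -> bool:
    """Return True if the text starts like C:\\ (drive letter, colon, backslash)."""
    return ABSOLUTE_PATH.match(text) is not None


# The attacker writes the link target in "encoding A" (UTF-7 is assumed here).
# UTF-7 never emits a literal backslash; it escapes it in a base64 run.
raw = "C:\\WINNT\\SYSTEM32\\CMD.EXE".encode("utf-7")

# The checker assumes "encoding B" (ASCII) and so never sees a backslash.
as_checked = raw.decode("ascii")

# A later stage decodes with the real encoding and recovers the path.
as_used = raw.decode("utf-7")

print(as_checked, looks_absolute(as_checked))
print(as_used, looks_absolute(as_used))
```

    The check passes the ASCII reading and would only have caught the
    UTF-7 reading, which is why knowing the correct encoding before
    checking closes the gap.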

    But look at it like this. Suppose an HTML document had a meta tag which
    claimed: <META HTTP-EQUIV="Content-length" CONTENT=1>. In this
    circumstance, which would you prefer to believe: The HTTP Content-length
    header? Or the meta tag? (One can certainly imagine buffer-overrun
    exploits if browsers were to make the wrong choice).

    Of course, having said that, document authors /can/ affect HTTP headers
    directly anyway. If the document were to be written in PHP instead of
    HTML then a document author could generate any HTTP headers they wanted!
    (I've actually done this to deliver documents in UTF-8 against the
    server's default). All I can assume is maybe there's some sort of threat
    model in place which assumes that anyone who can code in PHP can't
    possibly be an attacker! If so, it's clearly nonsense.
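
    The same idea, a document generating its own headers, can be sketched
    outside PHP. The example below uses Python's WSGI calling convention
    rather than PHP, purely as an illustration; the app() name and the
    page body are made up for the sketch.

```python
def app(environ, start_response):
    """Serve one page as UTF-8, whatever the server's default charset is."""
    body = "<p>caf\u00e9 \u2603</p>\n".encode("utf-8")
    headers = [
        # The script, not the server configuration, declares the encoding.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

    For local experiments, something like wsgiref.simple_server.make_server
    from the standard library could serve this callable.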

    I still maintain, though (in agreement with Jon) that a server should
    obey the document author by taking notice of meta tags and transforming
    them into HTTP headers. (At the very /least/, it should take the meta tag
    as a hint, and use it as an HTTP header if the hint turns out to be true).
    To ignore them altogether is just dumb.
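
    One way a server might treat the meta tag as a hint is sketched below
    in Python. Everything here is an assumption made for illustration:
    the content_type_for() name, the 1024-byte scan window, the
    iso-8859-1 default, and above all the test for "the hint turns out to
    be true", which is taken to mean only that the bytes decode cleanly
    under the claimed charset.

```python
import codecs
import re

# Matches <meta charset=...> and the http-equiv Content-Type form alike.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def content_type_for(document: bytes, server_default: str = "iso-8859-1") -> str:
    """Build a Content-Type value, promoting the meta charset if it checks out."""
    charset = server_default
    match = META_CHARSET.search(document[:1024])  # only scan the start of the file
    if match:
        hint = match.group(1).decode("ascii")
        try:
            codecs.lookup(hint)    # is this a charset name we know at all?
            document.decode(hint)  # do the bytes actually decode under it?
            charset = hint.lower()
        except (LookupError, UnicodeDecodeError):
            pass                   # hint was wrong or unknown: keep the default
    return f"text/html; charset={charset}"
```

    A clean decode is a necessary condition rather than proof: most
    single-byte encodings accept any byte sequence, so this check is only
    really discriminating for encodings such as UTF-8.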


    PS. I haven't mentioned Unicode domain names. That's a different kettle
    of fish altogether. Maybe we could have another thread for that.


    > -----Original Message-----
    > From: Peter Kirk []
    > Sent: Monday, September 29, 2003 5:33 PM
    > To: Jill Ramonsky
    > Cc:
    > Subject: Re: Fun with proof by analogy, was Re: Mojibake on
    > my Web pages
    >
    > I know I don't understand all the issues here, but I think I spot one
    > flaw in the argument. This seems to imply that all security holes are
    > the work of the content providers and none related to the servers. In
    > other words, that all servers and their administrators are entirely
    > trustworthy. This is certainly not necessarily true. And if a content
    > provider can compromise security by confusing encodings, so can a
    > server.
    >
    > This could become a significant security hole when we get Unicode
    > domain names. A malicious server administrator could register the
    > mojibake equivalent of a legitimate security sensitive domain name and
    > then deliberately serve the mojibake version to users, etc etc.
