On Wed, 27 Oct 2004 06:32:07 PDT, Michael Wojcik said:

> > "A program designed for inputs from people is usually stressed beyond
> > breaking point by computer-generated inputs." -- Dennis Ritchie
>
> Moot. Since HTML is frequently computer-generated, HTML renderers
> shouldn't be designed for human-generated input.

Not moot at all. Remember that an array of test cases is human-generated
input - and the renderer is usually mostly tested against said input.

And even more to the point - automated testing isn't a panacea either.
Just because you've "fixed" the browser so it doesn't crash when you
point it at file:///dev/urandom (or the moral equivalent) doesn't mean
you've achieved good coverage.

Somebody else mentioned a "1 million random events" test for Palm
programs. The problem is that although a test like that *will* shake out
a lot of bugs, and is probably useful for that, it is *NOT*, repeat
*NOT*, likely to trip over the same bugs as somebody who looks carefully
at the application and realizes that if they feed it the string
'../../../../../..' (256 levels deep), and then hit control-Z while it's
evaluating that path, they can get it into a state you didn't want it in.

Remember - if you're not feeding it raw /dev/urandom, you're probably
feeding it something "similar to HTML, but malformed" - and at that point
you have a problem, because your generator is restricted to the sorts of
malformations you've taught it to produce, and it *won't*, in general,
emit any malformed HTML that you've not conceived of. In other words, it
almost certainly won't create a test case for one of the Unicode
directory traversal attacks unless you've taught it about Unicode....
(The first sketch below shows the shape of the problem.)

> I think that's a straw man, Valdis. HTML renderers should expect
> malformed HTML input, and dealing with it is not difficult. There's
> simply no excuse for buffer overflows and null pointer dereferences
> when processing HTML. It's just not that hard a problem. It's not a
> matter of exhaustive testing; the kinds of bugs found by Mangleme are
> basic ones that any code review should have caught - if the code was
> written properly in the first place.

I was speaking more generally - although it's *true* that there should
be basic sanity checking, the general problem is that a programmer can't
write code to protect against bugs he hasn't conceived of.

> Basic input validation and sanitization isn't that difficult.

Yes, *basic* validation isn't that hard. It's the corner cases that end
up biting you most of the time (the second sketch below is the sort of
thing I mean).

> I write comms code - client- and server-side middleware. I wouldn't
> dream of implementing a protocol with code that didn't sanity-check
> the data it gets off the wire.

And you've *never* shipped a release, had a bug reported against it,
looked at the offending code, and done a Homer Simpson "D'oh!"?

> I don't see any reason why browser writers shouldn't be held to the
> same standard. Avoiding unsafe assumptions when processing input
> should not add significantly to development time; if it does, you need
> to retrain your developers.

How much would it have added to development time to have closed *all*
the holes *up front* (including *thinking* of them) to stop Liu Die Yu's
"Six Step IE Remote Compromise Cache Attack"?
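To make the generator point concrete, here's the first sketch: a toy
mutation fuzzer, entirely made up for illustration (it's nobody's real
test harness). Its whole universe of "malformed HTML" is the three
mutations its author thought of:

    /* Toy mutation fuzzer - illustration only.  It malforms a seed
     * document by deleting, duplicating, or truncating at a random
     * position.  Run it for a million iterations and it will never
     * emit a Unicode traversal sequence, because nothing in mutate()
     * can produce one. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static const char seed[] = "<a href=\"page.html\">link</a>";

    static void mutate(char *buf)
    {
        size_t len = strlen(buf);
        size_t pos = (size_t)rand() % len;

        switch (rand() % 3) {
        case 0: /* delete one character */
            memmove(buf + pos, buf + pos + 1, len - pos);
            break;
        case 1: /* duplicate one character */
            memmove(buf + pos + 1, buf + pos, len - pos + 1);
            break;
        case 2: /* truncate the document here */
            buf[pos] = '\0';
            break;
        }
    }

    int main(void)
    {
        char buf[2 * sizeof seed];

        srand(12345);  /* deterministic, so the example is repeatable */
        for (int i = 0; i < 10; i++) {
            memcpy(buf, seed, sizeof seed);
            mutate(buf);
            printf("test case %2d: %s\n", i, buf);
        }
        return 0;
    }

The output space is exactly the mutation table, no more - which is the
whole problem.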
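And the second sketch, for the corner-case point - again made up, with
hypothetical names. The "basic" check below catches the obvious
traversal and waves the percent-encoded variant straight through,
because nobody taught it about encoding:

    /* Hypothetical path sanity check - the function and the test
     * strings are invented for illustration.  Basic validation done;
     * corner case missed. */
    #include <stdio.h>
    #include <string.h>

    /* Returns 1 if the request path looks safe, 0 otherwise. */
    static int path_looks_safe(const char *path)
    {
        if (strstr(path, "../") != NULL)    /* the obvious case  */
            return 0;
        if (strstr(path, "..\\") != NULL)   /* DOS-style variant */
            return 0;
        return 1;                           /* but not %2e%2e%2f */
    }

    int main(void)
    {
        const char *tests[] = {
            "/docs/readme.txt",
            "../../../../etc/passwd",
            "%2e%2e%2f%2e%2e%2fetc%2fpasswd",  /* slips through */
        };

        for (size_t i = 0; i < sizeof tests / sizeof tests[0]; i++)
            printf("%-35s %s\n", tests[i],
                   path_looks_safe(tests[i]) ? "ACCEPTED" : "rejected");
        return 0;
    }

The programmer who wrote path_looks_safe() *did* do input validation.
He just didn't conceive of the encoded form - and no mutation table he
wrote would have conceived of it for him.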
Remember what David Brodbeck said, which is what I was replying to:

> How many times have you seen a word processor crash due to an
> unfortunate sequence of commands or opening a corrupted file, for
> example?

The point people are missing is that covering all (or even anywhere
*near* "all") the "unfortunate sequences" or "corrupted files" is
*really, really* hard. Quite often, "unfortunate sequence" means
something like "issue the command to open a file" followed by "hit
'cancel' while the program is waiting for the next block from the disk
to feed to a decompressor routine, causing the program to fail to clean
up all the memory allocated during decompression, because the
decompressor routine thought keyboard events were blocked, but somebody
else changed the code so they weren't - which does nothing immediately
fatal, but results in a double-free error the next time you try to print
a file".

Let me know when an automated test event generator manages to trigger
*THAT* case. ;)

(And if you think that's a totally spurious, made-up condition, go look
at why zlib 1.1.4 was released - I've not exaggerated *that* much...)
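If it helps to see the shape of that failure, here it is boiled down to
a few made-up lines of C. This is *not* the zlib code, or anybody's
shipping code - just the same species of bug:

    #include <stdlib.h>

    static char *scratch;    /* buffer owned by the decompressor */
    static int   cancelled;

    /* Cancel handler.  The decompressor's author believed keyboard
     * events were blocked mid-decompress; somebody later changed the
     * code so they weren't. */
    static void on_cancel(void)
    {
        free(scratch);       /* "clean up, we were interrupted"  */
        cancelled = 1;       /* ...but forgets:  scratch = NULL; */
    }

    static void decompress_block(void)
    {
        scratch = malloc(4096);
        /* ...halfway through the block, the user hits 'cancel'... */
        on_cancel();
        if (cancelled)
            return;          /* nothing immediately fatal happens */
    }

    int main(void)
    {
        decompress_block();
        /* Much later - say, the next time you print - the normal
         * cleanup path runs, and frees scratch a second time. */
        free(scratch);
        return 0;
    }

The fatal event lands arbitrarily far from the actual mistake, which is
exactly why a random event generator almost never walks into it.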