
Re: Bypassing of web filters by using ASCII



On 21 Jun 2006 at 13:11, k.huwig@xxxxxxxxx wrote:

> 
> 1. problem description
> 
> The character set ASCII encodes every character with 7 bits. Internet
> connections transmit octets with 8 bits. If the content of such a
> transmission is encoded in ASCII, the most significant bit must be ignored.
>

Not quite. The most significant bit must be set to zero when encoding (from
RFC 20: "For concreteness, we suggest the use of standard 7-bit ASCII
embedded in an 8 bit byte whose high order bit is always 0"). So a byte whose
high bit is set is simply illegal in US-ASCII. Which leads to the following
point:

If a message carries a description (in our case the charset specification,
i.e. charset=US-ASCII) that is inconsistent with the message data (in our
case, data outside the declared charset, i.e. bytes with the high bit set),
what is a message reader to do?
Security-wise, the best option would be to reject the message. Yet of course
this leads to a less than ideal user experience. So the obvious solution is
to virtually modify one of the elements (either the message description or
the message data) so that consistency is established.
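
To make the "reject" option concrete, here is a minimal Python sketch of a
strict check. The function names are mine, purely for illustration; they do
not describe how any browser or filter actually implements this.

```python
def high_bit_offsets(data: bytes) -> list[int]:
    """Return the offsets of bytes that are not legal 7-bit US-ASCII."""
    return [i for i, b in enumerate(data) if (b & 0x80) != 0]


def decode_strict_us_ascii(data: bytes) -> str:
    """Decode data declared as US-ASCII, rejecting it on any high bit.

    Hypothetical helper: this is the strict policy, i.e. refuse the
    message rather than repair the data or the declared charset.
    """
    bad = high_bit_offsets(data)
    if bad:
        raise ValueError(f"not US-ASCII: high bit set at offsets {bad}")
    return data.decode("ascii")
```

A reader or filter using something like this would refuse the message
outright instead of guessing which of the two elements to "fix".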

Now, IE changes the data, i.e. sets each msb to zero, and thus establishes
consistency: the data becomes a valid US-ASCII byte stream. Firefox and
Opera, I assume, take the other path and modify the message description to
read "ISO-8859-1", and thus establish consistency, as now the byte stream is
valid ISO-8859-1 data.
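
The two repair strategies can be sketched in a few lines of Python. This is
only an illustration of the behaviour described above (and, for
Firefox/Opera, of what I assume they do), not actual browser code; the
function names and the sample payload are made up.

```python
def strip_msb(data: bytes) -> str:
    """Repair the data: clear the high bit of each byte, then read as US-ASCII."""
    return bytes(b & 0x7F for b in data).decode("ascii")


def reinterpret_latin1(data: bytes) -> str:
    """Repair the description: keep the bytes, read them as ISO-8859-1."""
    return data.decode("iso-8859-1")


# Hypothetical payload: "<script>" with the high bit set on every byte.
payload = bytes(b | 0x80 for b in b"<script>")

as_msb_stripped = strip_msb(payload)        # the tag reappears
as_latin1 = reinterpret_latin1(payload)     # accented letters, no '<'
```

A filter that inspects the raw bytes sees no "<" (0x3C) anywhere in such a
payload. Whether the receiving end sees markup depends entirely on which of
the two repairs it applies.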

> Of the tested browsers Firefox 1.5, Opera 8.5 and InternetExplorer 6,
> only the InternetExplorer does this correctly, the others evaluate the
> bit and display the characters as if they were from the character set
> ISO-8859-1. 

So what I don't understand now is why IE's "solution" is any better than
that of Opera/Firefox.

Why is modifying the data (msb) any better than modifying the data
description (charset)?

Please note: the attack you described is interesting and elegant. I just have
reservations about the statement that IE's approach is correct (vs. the other
browsers). I was involved in research around similar situations wherein the
strict RFC was violated, and different products interpreted the data
differently. And in such cases, I think we should be cautious about which
product is "correct" (except that naturally, security-wise, it's more correct
to reject the message altogether).

Food for thought,
-Amit