Re: base64
On Tue, 2003-09-23 at 19:10, Lothar Kimmeringer wrote:
> On Tue, 23 Sep 2003 12:18:31 -0400 (EDT), Birl wrote:
>
> >Excuse my ignorance. I tried to pook around some B64 attachements in my
> >email files for an answer.
> >
> >
> >Are you stating that an =
> >
> >1) should not appear in B64 at all
> >2) should not appear in the middle of a line, but only at the EOLN
> >3) should only appear at the end of a B64 file
>
> See the corresponding RFC. The number of characters in a base64-coded
> text must be a multiply of 4. So ='s are used if there aren't enough
> characters and are added at the end of the text.
>
> = is not a valid character inside Base64 and an encoder should stop
> with an error or stops decoding.
>
> >Answering that question could help me better determine how to write a
> >procmail filter for this.
>
> /.*=[^=]/
> (untested)
RFC 2045 states (section 6.8):
"The encoded output stream must be represented in lines of no more
than 76 characters each. All line breaks or other characters not
found in Table 1 must be ignored by decoding software. In base64
data, characters other than those in Table 1, line breaks, and other
white space probably indicate a transmission error, about which a
warning message or even a message rejection might be appropriate
under some circumstances."
So, characters outside the 65 character set are "ignored", but "warning"
or "rejection" might be appropriate. Nicely ambiguous. This does not
apply strictly to '=', but might taken to apply to '=' within a base64
encoding.
"Because it is used only for padding at the end of the data, the
occurrence of any "=" characters may be taken as evidence that the
end of the data has been reached (without truncation in transit). No
such assurance is possible, however, when the number of octets
transmitted was a multiple of three and no "=" characters are
present."
This is also not clear. Does it mean that '=' can be taken to delimit
the data? Or does it mean no more than finding a '=' that means no data
has been lost?
So, there is ambiguity in RFC 2045, and this is the point of the
original post. Different people, and therefore different implementations
will have different interpretations. There is therefore potential for a
vulnerability when checks are performed using one interpretation but the
actual receiver uses another interpretation.
It would be nice to be able to enforce rules in email servers - there
are many ways in which messages do not conform to the standards - even
when they are unambiguous. But there are too many common email user
agents which generate non-conforming messages.
Or should we reject all these broken messages? ;-)
cheers
David Wilson David.Wilson@xxxxxxxxx
Isode Limited Tel: +44 (0) 20 8783 2961
http://www.isode.com