Re: [Mutt] #714: Lines starting From aren't escaped saving from Maildir to mbox
#714: Lines starting From aren't escaped saving from Maildir to mbox
Comment (by rtc):
Replying to [comment:20 vinc17]:
> Replying to [comment:19 rtc]:
> > Any reader that does not know this sick convention will wreak havoc in
> > such mailboxes;
> This is not a flaw.
Of course it is a flaw, and a serious one.
> Mutt supports the Content-Length header, so, no problem for me. At least
> this doesn't corrupt the message
That doesn't help you at all. It is not mutt but the delivery program that
writes the mail to your mailbox, and it will usually do so in mboxo format.
Your message is hence corrupted before mutt even sees it.
> ("From " quoting is particularly evil when it occurs in an attachment).
I agree, but if you assume mboxrd, you will at least get back the original
attachment if it was written by an mboxrd writer.
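To illustrate the round trip, here is a minimal sketch in Python. The
function names are made up for this example and are not taken from mutt or
from any delivery program. It assumes the usual mboxrd rule: on writing,
every line matching ^>*From gets one more ">", and on reading exactly one
">" is removed from such lines.

```python
import re

# Lines that start with zero or more '>' followed by "From " get quoted.
_QUOTE = re.compile(rb"^(>*From )", re.MULTILINE)
# On reading, exactly one leading '>' is removed from those lines again.
_UNQUOTE = re.compile(rb"^>(>*From )", re.MULTILINE)


def mboxrd_quote(body: bytes) -> bytes:
    """Quote a message body the way an mboxrd writer would (sketch)."""
    return _QUOTE.sub(rb">\1", body)


def mboxrd_unquote(body: bytes) -> bytes:
    """Undo mboxrd quoting on a body read back from the mailbox (sketch)."""
    return _UNQUOTE.sub(rb"\1", body)


original = (
    b"Hi,\n"
    b"From here on it gets tricky.\n"
    b">From the quoted reply.\n"
    b">>From deeper.\n"
)
stored = mboxrd_quote(original)
restored = mboxrd_unquote(stored)
```

Because every level of existing quoting is shifted by one on writing, the
reader can undo it without guessing, and restored equals original.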
> I've never seen any problem in practice (but I switched to maildir for
> incoming mail a few years ago). Also, nowadays most users use filters
> (mainly because of spam), and it's really easy to remove the
> Content-Length header. Ditto for the Status header.
You are playing down the problem here and defining it to be non-existent
again.
> Also, without Content-Length, the mailer would be very inefficient on
> large mailboxes.
It would be inefficient on mailboxes with large messages in them, though
you overestimate the effect here.
> > There is no such thing as "the expected format" of a "line starting
> > with 'From'".
> See the is_from function in from.c.
Did anyone ever have a mailbox where this would be necessary? A correct
is_from would return 1 if the line starts with "From ", 0 in any other
case.
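As a sketch of what I mean (in Python rather than C, and with a made-up
function name; this is not the is_from from mutt's from.c), the whole test
is a comparison of the first five bytes:

```python
def is_from_line(line: bytes) -> int:
    """Return 1 if the line starts with the five bytes 'From ', else 0."""
    # No parsing of user names or dates: only the fixed prefix is checked.
    return 1 if line.startswith(b"From ") else 0
```

A header line such as "From: someone" or a quoted ">From " line gives 0,
and anything after the fifth byte is ignored.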
> Instead of dealing with theory that doesn't match the practice, Mutt
> tries to solve practical problems. And it does it well. [...] In any case,
> I think this is fine that Mutt has some recovering mechanisms.
On the contrary, mutt seems to ignore practical problems and tries to
solve theoretical ones that are not actually there, creating real
practical ones by doing that. As jwz correctly says: "Some people will
tell you that you should do stricter parsing on those lines: check for
user names and dates and so on. They are wrong. The random crap that has
traditionally been dumped into that line is without bound; comparing the
first five characters is the only safe and portable thing to do."
> But assuming mboxo leads to the correct result most of the time.
Assuming mboxrd leads to the correct result most of the time, too.
> > You are talking about the mboxcl format above. This is by far not the
> > format supported by most software, either.
> It is supported by Mutt, and this is sufficient for most Mutt users.
It doesn't help users at all, because the mail is initially written not by
mutt, but by the delivery program, and the delivery program decides how to
quote the message, not mutt.
> > A From_ always needs to be quoted except if you assume mboxcl format,
> Not all of them.
They all do need to be quoted, and they all are quoted by every delivery
program known to me.
> > in which case you need to assume mboxcl, not mboxo, on reading the
> > mailbox,
> In practice, one can have mixed-form of mailboxes.
Yes, with mutt, you will usually end up with mboxcl, because it adds
Content-Length in addition to the quoting already done by the delivery
program.
> > Assuming mboxo format will never lead to fewer message corruptions
> > compared to assuming the mboxrd format. Of course you can always construct
> > specific examples where assuming the mboxrd format will corrupt the
> > message worse than assuming mboxo would. But these cases are very rare in
> > the real world and are never fatal.
> No, they are common and annoying.
Yes, but only where programs that do cryptographic signing already assume
that the message will be written in mboxo and so already do the quoting on
the sender side. If it is then actually written in mboxo format, but you
assume mboxrd on reading, you will inevitably get problems and the signature
won't match. For this practical case, which is the result of a sick and
ugly kludge (for which the inventors should be shot; using any character
except '>' to place in front of 'From ' on the sender side would never
have brought the problem into existence), mboxrd is indeed not the most
practical choice.
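To make the signing case concrete, here is a small sketch in Python. The
function names are made up for illustration and this is not code from mutt
or from any delivery program. It assumes the signer already turned "From "
into ">From " before signing, the writer applies mboxo quoting (only bare
"From " lines get a ">"), and the reader strips one ">" as an mboxrd reader
would.

```python
import re


def mboxo_quote(body: bytes) -> bytes:
    """mboxo writer (sketch): only lines starting with 'From ' are quoted."""
    return re.sub(rb"^From ", b">From ", body, flags=re.MULTILINE)


def mboxrd_unquote(body: bytes) -> bytes:
    """mboxrd reader (sketch): strip one '>' from lines matching ^>+From ."""
    return re.sub(rb"^>(>*From )", rb"\1", body, flags=re.MULTILINE)


# The signing program pre-quoted this line, so the signature covers ">From ".
signed_body = b"Hello,\n>From now on, use the new key.\n"

stored = mboxo_quote(signed_body)    # mboxo leaves ">From " lines alone
read_back = mboxrd_unquote(stored)   # the mboxrd reader removes the '>'

signature_still_matches = (read_back == signed_body)
```

Here stored is identical to signed_body, but read_back has lost the ">" that
the signer put there, so signature_still_matches is False.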
> No information is lost. Content-Length is just meta-data.
Strictly speaking, it is part of the header of the messages, not of the
meta-data. The only meta-data that is present in mbox* files is the From_
lines.
> I really don't mind about your infinity of possible messages. Not all
> possible messages occur in practice. Your model is wrong, as well as your
> deductions.
My model may not be perfect, but my deductions are not at all as wrong as
you try to suggest. I don't deny that you have some points. You
shouldn't completely deny my points, either. Please review my new patch
at #2976, which leaves the user the option of whether to add mboxcl
idiosyncrasies to his mailboxes and whether to save fccs in mboxrd or in
mboxcl.
--
Ticket URL: <http://dev.mutt.org/trac/ticket/714#comment:21>