Re: URLs screwed in the mail body
Kyle,
On Sat, Mar 8, 2008 at 6:15 PM, Kyle Wheeler <kyle-mutt@xxxxxxxxxxxxxx> wrote:
> On Saturday, March 8 at 02:10 PM, quoth Francis Moreau:
>
> > Actually it would be better to fix the source of the problem instead
> > of trying to find a workaround... But I don't know where these URLs
> > get split in the first place. Perhaps you could enlighten me?
>
> Well, the message format RFC (RFC 2822) recommends a maximum line
> length of 78 characters, with a hard limit of 998. If a URL is longer
> than 78 characters, many email clients will split the URL.
>
> One way that it's sometimes done, for example by Apple's email client,
> is as a format=flowed email with delsp=yes. What that does is allow
> them to indicate (with a single space at the end of the broken line)
> that the next line should be appended without a space separating them.
>
My last example of a broken URL was sent by Outlook, and format=flowed
was not set.
So I guess there's nothing in the email headers that indicates that
the body has been rewrapped to a maximum line length of 78.
In this case, your approach seems correct: we have no other way than to
parse the email body and try to detect split URLs. If we find some, then
we need to re-splice them.
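
To make the idea concrete, here is a rough Python sketch of such a
re-splicing pass. Everything in it is an assumption for illustration: the
name resplice, the 60-character threshold, and the rule that a
continuation fragment must contain one of / ? & = % #. It will still
guess wrong on some messages.

```python
import re

# A URL that runs all the way to the end of the line.
URL_AT_END = re.compile(r"https?://\S+$")
# Characters that make a leading token look like the rest of a URL (assumed heuristic).
URL_CHARS = re.compile(r"[/?&=%#]")


def resplice(lines, min_len=60):
    """Glue URL fragments back onto a long line that ends in a URL."""
    lines = list(lines)
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        # Only lines that are long enough to have been wrapped are candidates.
        while i + 1 < len(lines) and len(line) >= min_len and URL_AT_END.search(line):
            head, _, tail = lines[i + 1].partition(" ")
            if not head or not URL_CHARS.search(head):
                break  # next line does not look like a URL continuation
            line += head
            if tail:
                # The rest of the next line is ordinary text; keep it and stop.
                lines[i + 1] = tail
                break
            # The whole next line was a fragment; drop it and look further.
            del lines[i + 1]
        out.append(line)
        i += 1
    return out
```

A short line such as "Go to http://www.google.com/ and tell me what" is
left alone, because it is under the threshold and the next word contains
none of the continuation characters.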
>
> >> macro pager \cb "<pipe-message>tr -d ' \r\n' | urlview<enter>"
> >>
> >> The second option seems to be the best... but it has the problem that
> >> it may concatenate urls that shouldn't be concatenated (For example,
> >> imagine the sentence "Go to http://www.google.com/ and tell me what
> >> you think" - the url would become
> >> http://www.google.com/andtellmewhatyouthink
> >
> > > Why would spaces be deleted in this case?
>
> That's what `tr -d` does: it deletes all the characters you specify.
> In that macro, I specified three characters: a space, a carriage
> return, and a newline. The reason I specified a space is because of
> Apple's format=flowed trick.
Sorry, I didn't see the space character. The font is too small on this
laptop and it's hard to see.
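
To check my understanding of what the macro does to the text, here is a
small Python equivalent of that tr -d invocation (the helper name
strip_like_tr is made up). It reproduces the over-concatenation from the
Google example above:

```python
def strip_like_tr(text, chars=" \r\n"):
    """Delete every occurrence of the given characters, like `tr -d`."""
    return text.translate({ord(c): None for c in chars})


message = "Go to http://www.google.com/ and tell me what\nyou think"
# The words after the URL get glued onto it, as do the words before it.
print(strip_like_tr(message))
```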
>
> What I wish we could do is use mutt to decode format=flowed and
> quoted-printable without having it feed things to the standard mailcap
> program when pipe_decode is set.
>
Well, as I said before, relying on format=flowed alone may not be
enough.
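
For the messages that do declare format=flowed with delsp=yes, the
rejoining itself looks simple. A minimal Python sketch of the rule you
described, assuming the lines have already been decoded and have lost
their line endings; the name unflow is hypothetical, and quoted lines and
space-stuffing are deliberately ignored:

```python
def unflow(lines, delsp=True):
    """Join soft-wrapped format=flowed lines into whole paragraphs."""
    paragraphs = []
    current = ""
    for line in lines:
        if line == "-- ":
            # The signature separator ends in a space but is never a soft break.
            if current:
                paragraphs.append(current)
                current = ""
            paragraphs.append(line)
        elif line.endswith(" "):
            # Trailing space marks a soft break; with delsp=yes that space is dropped.
            current += line[:-1] if delsp else line
        else:
            # No trailing space: this line ends the paragraph.
            paragraphs.append(current + line)
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs
```

With delsp=yes a URL broken as "http://example.com/a/very/ " followed by
"long/path" comes back as one unbroken token.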
>
> >> Perhaps it's better to do this:
> >>
> >> 3. Pipe it through lynx to extract the urls before piping it to
> >> urlview, like so:
> >>
> >> macro pager \cb "<pipe-message>lynx --force-html --dump | urlview<enter>"
> >>
> >
> > Yes. It would be nice to apply this macro for html emails only. For
> > text emails just do the usual/fast thing.
> >
> > Thank you for your useful feedback.
>
> Yeah... unfortunately, that macro (I've been trying it all yesterday)
> doesn't *quite* work on all emails. Lynx sometimes gets confused by
> the message headers, I think.
>
But doesn't pipe_decode, when set, remove them?
> I just put together a perl script that would do the trick instead of
> lynx. It's attached.
>
Sorry, but I don't speak Perl, so I can't comment. I'm wondering if
sed plus a good regexp could do the job instead.
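
Something along these lines is what I have in mind, written as a Python
sketch rather than sed. The pattern and the name extract_urls are my own
assumptions, not what your script does, and it does nothing about URLs
that are split across lines:

```python
import re

# Stop at whitespace, angle brackets and double quotes (an assumed, simplistic pattern).
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def extract_urls(text):
    """Return the URLs found in text, without trailing sentence punctuation."""
    return [url.rstrip(".,;:!?)") for url in URL_RE.findall(text)]
```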
Thanks
--
Francis