Re: CTRL-M everywhere



George wrote:
> Bob Proulx wrote:
> > I have a family member who sends mail from a broken mailer.  I can't
> > change them so instead I filter and fix the messages.  I have examples
> > of how to do this in procmail that I would post if there were
> > interest.
> 
> I trust it's not too off-topic.  Please do!

The problem I was having that I wanted to solve was that MS-Outlook
sends messages with every paragraph all on one line.  The lines can go
out to hundreds of columns.  It is as if they expected to set
format=flowed but forgot.  There are several[1] good references to
this subject.

It just frustrated me every time I got an email this way.  I could not
get them to configure the mailer correctly to avoid this.  I decided I
would preprocess the messages in just this one particular case to ease
my annoyance.  So I am filtering this one person's messages through
procmail upon receipt and fixing their MS-Outlook problem.  But
normally I would not want to modify messages like this.  (I have
changed the names in this example.)

  # Jane uses MS-Outlook and it is broken enough to annoy me terribly.
  # Try to fix it on the receive end.  What an abomination!  Set the
  # format=flowed parameter as the mailer seems to have assumed it.
  # For mail from any of these addresses, use scoring to produce an OR test.
  :0
  * 1^0 ^From: .*jane_doe
  * 1^0 ^From: .*Jane Doe
  * 1^0 ^From: .*Doe, Jane
  {
    # Mail from Exchange, add format=flowed, if not there already.
    :0
    * ^X-MimeOLE: Produced By Microsoft Exchange
    * ^Content-Type:.*text/plain
    * ! ^Content-Type: text/plain;.*format=flowed
    {
      CONTENTTYPE=`formail -x Content-Type:`
      :0 fhw
      | formail -i "Content-Type:$CONTENTTYPE;format=flowed"
    }
  }

The same technique could be extended to cover the problem the
original poster was having with quoted-printable.  I don't have an
example of a bad message that is quoted-printable but lacks the
encoding type, so this is a made-up recipe without all of the details
filled in.

  :0
  * 1^0 ^From: .*addr1
  * 1^0 ^From: .*addr2
  {
    :0
    * ^some-tag-to-identify
    * ^Content-Transfer-Encoding:bad-encoding
    * ! ^Content-Transfer-Encoding: *quoted-printable
    {
      # Content-Transfer-Encoding holds a single token, so replace the
      # whole field rather than appending a parameter to the old value.
      :0 fhw
      | formail -i "Content-Transfer-Encoding: quoted-printable"
    }
  }

I am using scoring "1^0" to make a logical OR test.  Each condition
that matches adds 1 to the score, and the recipe is triggered when the
total is greater than zero, so any one match is enough.  This works
nicely for a few matches.  If I were going to match against a large
number of cases I would do this using grep instead of inline.

  :0
  * ? formail -x From: | grep -Fqsf $HOME/Mail/addrmatchlist
  {
    # nested recipes go here, as in the examples above
  }

Be sure to understand the portability issues with using 'grep' with
the -s and -q options.  Actually that applies to all of the options
used here, because traditional grep does not have them.  But the above
works with GNU grep.  Use something like
'fgrep -f $HOME/Mail/addrmatchlist >/dev/null 2>&1' for a traditional
grep.

This grabs the From: address of the message and greps a match list
file.  If there is a match then the rule triggers.  The match list is
then maintained as a separate file.
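
To see what that condition is doing outside of procmail, here is a
rough sketch of the same test as a plain shell script.  The addresses,
the temporary file standing in for the match list, and the
from_matches function name are all made up for illustration.  The
printf line stands in for the output of 'formail -x From:'.

```shell
#!/bin/sh
# Hypothetical match list.  In the recipe this would be
# $HOME/Mail/addrmatchlist, with one fixed string per line.
list=$(mktemp)
trap 'rm -f "$list"' EXIT
printf '%s\n' 'jane_doe@example.com' 'john_roe@example.com' > "$list"

# Exit status 0 when the text on stdin contains any string from the
# list.  Output is discarded with a redirect instead of -q and -s.
from_matches() {
  grep -F -f "$list" >/dev/null 2>&1
}

# Stand-in for the output of "formail -x From:".
if printf '%s\n' ' Jane Doe <jane_doe@example.com>' | from_matches; then
  echo "match"
else
  echo "no match"
fi
```

Procmail's '* ?' condition uses the exit status in the same way the
'if' does here.  One thing to watch for, at least with GNU grep, is
that a blank line in the match list acts as an empty pattern and
matches every message.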

Getting back to the quoted-printable issue, it is also possible to
change the mail encoding.  If a message in a non-7bit charset passes
through a path that is not 8-bit clean, it should be converted
automatically to either a quoted-printable encoding or a base64
encoding so that it survives the 7bit path.  But once you have
received it you can convert it back to an 8bit encoding.  Here is an
example doing this using mimencode.  This should remove any "=0D"
sequences from the message.

  :0
  * ^Content-Type: *text/(plain|html)
  {
    :0 fbw
    * ^Content-Transfer-Encoding: *quoted-printable
    | mimencode -u -q

      :0 Afhw
      | formail -i "Content-Transfer-Encoding: 8bit"

    :0 fbw
    * ^Content-Transfer-Encoding: *base64
    | mimencode -u -b

      :0 Afhw
      | formail -i "Content-Transfer-Encoding: 8bit"
  }

Both pairs of rules work the same way.  If the
Content-Transfer-Encoding header is quoted-printable then filter (f)
the body (b) through mimencode, waiting for it to finish (w), to decode
the body to 8bit.  If the conditions of that immediately preceding
rule matched (A) then filter (f) the header (h) through formail,
waiting for it (w), to change the header field to match the new
encoding.  The base64 pair does the same thing with 'mimencode -u -b'.

Using a technique like this the original poster should be able to
correct for almost any trouble from any particular mailer.  But really
they should not have to do so.  This is all the wrong end of the
string to be pushing upon.  The bad software should be fixed.  That
benefits everyone.

Hope this was interesting...

Bob

[1] http://www.google.com/search?hl=en&q=format%3Dflowed&btnG=Google+Search