Re: Display Filters



On 2006-06-30, Dave Chandraratnam <davec@xxxxxxxxxxxxxxx> wrote:
> Gary,
> 
> On Thu, Jun 29, 2006 at 10:41:53AM -0700, Gary Johnson wrote:
> > 
> > I have a handful of scripts that I use for the display_filter, 
> > selected using message-hooks nested within folder-hooks.  They do a 
> > number of things.
> > 
> > -  Sed scripts that remove some or all of the "[-- .* --]" comments 
> >    that mutt adds to MIME parts.
> > 
> > -  A demoronizer equivalent that converts certain Microsoft 
> >    characters to their ASCII equivalents.
> > 
> > -  The mail-to-filter script that compresses long To: and Cc: lists 
> >    to no more than two lines.
> > 
> > -  A perl script that adds ANSI color escape sequences around the 
> >    headings of a company newsletter to make it easier to read 
> >    quickly.  (Mutt's "color body" commands would work for this, but 
> >    they can't be cleared and can't be set differently for different 
> >    messages.)
> > 
> > -  A perl script that attempts to recognize and remove the spurious 
> >    double spacing added by Outlook/Exchange to text messages.
> 
> Would it be possible that you could send on the scripts that you have?
> The "Demoronizer" and "Exchange" ones in particular.

The mail-to-filter is here:

    http://www.spocom.com/users/gjohnson/mutt/#tocc

I wrote a Microsoft-to-ASCII converter in C before I knew about the 
demoronizer script, but since that script is widely used and has 
documentation, I'd recommend using it instead.  You can get it here:

    http://www.fourmilab.ch/webtools/demoroniser/
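
For a sense of what this kind of conversion involves, here is a rough 
sketch. It is neither the demoroniser nor the C converter mentioned 
above. It assumes the input is single-byte Windows-1252 text (it would 
mangle UTF-8), it handles only six characters, and the function name 
ms_to_ascii is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map a few Windows-1252 punctuation bytes to ASCII.
#   0x91 0x92  curly single quotes  ->  '
#   0x93 0x94  curly double quotes  ->  "
#   0x96 0x97  en dash, em dash     ->  -
ms_to_ascii() {
    # printf turns the octal escapes into the six raw bytes.
    cp1252=$(printf '\221\222\223\224\226\227')
    # LC_ALL=C makes sed treat every byte as one character; the y command
    # maps each source byte to the ASCII character in the same position.
    LC_ALL=C sed -e "y/${cp1252}/''\"\"--/"
}
```

A real converter covers many more characters, including ones such as 
the ellipsis that expand to several ASCII characters, which a 
one-to-one mapping like this cannot do.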

The Outlook/Exchange thing is a work in progress, but what I have so 
far seems to work pretty well.  Unfortunately, the detection of 
badly-formatted Outlook messages relies on certain characteristics 
of the message bodies I typically receive from Outlook users, so it 
may not work robustly in other environments.  It's currently just 
part of a larger shell script that's really too much of a hack to be 
posted in its entirety.  Here's the Outlook/Exchange part:

----------------------------- cut here -----------------------------

# Correct the formatting of e-mail from Outlook that was written in HTML
# and sent as multipart/alternative with text/plain and text/html parts.
# Extra newlines have been inserted by Outlook or Exchange and everything
# but one-long-line paragraphs appears double-spaced.
#
# The 'if' clause works when mutt identifies the text/plain part as an
# attachment.  This is the more robust rule, but mutt doesn't consider
# the text/plain part of a multipart/alternative message to be an
# attachment if the multipart/alternative content-type is specified in
# the message header.  The 'elsif' clause is tried only if the first
# fails and it works for a message that has the '\n\n \n\n' pattern
# between the first two paragraphs.  The second part of the 'elsif'
# clause (following the first '||') is a special case for messages that
# have the body triple-spaced below the greeting, instead of the
# conventional double-space.  The third part of the 'elsif' clause
# (following the second '||') is a special case for check-in notices,
# which begin with a 3-line block rather than an unbroken paragraph.
#
perl -pe '
    BEGIN {
        $/ = "[-- Attachment #";
    }
    if (/\[-- Type: multipart\/alternative/ && /[^ ]\n\n \n\n[^ ]/) {
        s/\n\n/\n/gs;
    }
    elsif (/^From([^\n]|[^\n]\n[^\n])+\n\n[^\n]+\n\n \n\n *[^ ]/s ||
           /^From([^\n]|[^\n]\n[^\n])+\n\n[^\n]+\n\n \n\n \n\n *[^ ]/s ||
           /^From([^\n]|[^\n]\n[^\n])+\n\n##-+\n\n## [^\n]*\n\n##-+\n\n \n\n *[^ ]/s) {
        s/\n\n/\n \n/s;                 # Put a space in the line
                                        # between the header and the
                                        # body to protect that line from
                                        # deletion by the following
                                        # substitute statement.
        s/\n\n/\n/gs;
    }
'

----------------------------- cut here -----------------------------

Here's the more conservative of my "[-- .* --]" filters:

sed -e '
        /^\[-- Autoview using .* --]$/d
        /^^[][0-9;]*^G\[-- Autoview using .* --]$/d
        /^\[-- Attachment .* --]$/d
        /^^[][0-9;]*^G\[-- Attachment .* --]$/d
        /^\[-- Type: .* --]$/d
        /^^[][0-9;]*^G\[-- Type: .* --]$/d
'

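Note that each ^[ and each ^G above stands for a single control 
character: replace ^[ with a real ESC (0x1B) and ^G with a real BEL 
(0x07) before using the command.  The first ^ in each of those 
patterns is the usual start-of-line anchor.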
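
One way to avoid typing the control characters by hand is to let 
printf generate them. Here is a minimal sketch, assuming a POSIX shell. 
The function name strip_mutt_markers is made up for illustration, and 
the six patterns are the same as in the sed command above.

```shell
#!/bin/sh
# Hypothetical wrapper around the conservative "[-- .* --]" filter.
strip_mutt_markers() {
    esc=$(printf '\033')    # ESC, written as ^[ in the listing above
    bel=$(printf '\007')    # BEL, written as ^G in the listing above
    # Double quotes let the shell expand ${esc} and ${bel}; each \$ keeps
    # the end-of-line anchor from being read as a shell expansion.
    sed -e "/^\[-- Autoview using .* --]\$/d" \
        -e "/^${esc}][0-9;]*${bel}\[-- Autoview using .* --]\$/d" \
        -e "/^\[-- Attachment .* --]\$/d" \
        -e "/^${esc}][0-9;]*${bel}\[-- Attachment .* --]\$/d" \
        -e "/^\[-- Type: .* --]\$/d" \
        -e "/^${esc}][0-9;]*${bel}\[-- Type: .* --]\$/d"
}
```

Saved in a script that ends by calling strip_mutt_markers, this reads 
the message on standard input and writes the filtered text to standard 
output, which is what a display_filter command is expected to do.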

> I think that it would make my (and many other people's) life easier

Hope those are useful to you.

Regards,
Gary

-- 
Gary Johnson                               | Agilent Technologies
garyjohn@xxxxxxxxxxxxxxx                   | Wireless Division
http://www.spocom.com/users/gjohnson/mutt/ | Spokane, Washington, USA