
[IP] MTA packagers please consider SPF




Delivered-To: dfarber+@xxxxxxxxxxxxxxxxxx
Date: Sun, 04 Jan 2004 18:23:55 -0500
From: "Steven M. Bellovin" <bellovin@xxxxxxx>
Subject: Re: [IP] MTA packagers please consider SPF
To: dave@xxxxxxxxxx

>From: mengwong@xxxxxxxxxxxxxxx (Meng Weng Wong)
>Subject: MTA packagers please consider SPF
>To: chip@xxxxxxxxxxxxxxx
>Cc: dave@xxxxxxxxxx
>
>SUMMARY
>
>   This message goes to some of the maintainers of MTA packages for major
>   Linux distributions.  I ask that they consider adding SPF support to
>   their packages in helping to solve spam.  I founded pobox.com, which
>   produced the MySQL extensions for Postfix.  I authored the SPF
>   standard and am now trying to get it widely adopted.

Dave, I hope that people don't jump too rapidly on the SPF "standard".
From my perspective, it's very far from ready for prime time.
(Note that although I'm a member of the IESG, I'm speaking as an
individual.  I'm not even saying how I'd vote if this document were
to come before the IESG today -- IESG evaluations are a deliberative
process, and I could very easily be talked out of some or all of
my points.)

There are several major problems with the document as written,
including its semantic model, the uptake model, and the "specsmanship".
The last is easiest to fix, but it will render current implementations
useless if the eventual spec is different.  Running code is a great
way to test a concept; too much deployment of bad running code is
a tremendous obstacle to a decent standard.  I'm not even talking
of the perfect being the enemy of the good.

To make it easier to read, I've indented the major sections of this
note:

Specsmanship:
        The version number definition is problematic -- it only
        has major version numbers.  I suspect that we need minor
        version numbers as well, for operational debugging.

        The most glaring problem with SPF is the use of TXT records.
        TXT records are supposed to be free-form text, with no
        semantics attached.  The use of TXT for test purposes is
        understandable (though regrettable -- an experimental record
        type code would be better); the use of TXT records for
        textual error messages is not.  The document itself notes
        the problem of ordering of multi-record messages.  Beyond
        that, there are problems with internationalization:  what
        language should the error message be in, and in what
        character set is it encoded?  A simple URI would be a better
        solution; at the least, it should point to an SPFERR record.
        (Record subtyping in the DNS causes problems; see RFC 3445
        for some details on why.)

        The use of TXT-like records is problematic because it
        requires parsing an ASCII string in a DNS resolver.  (Yes,
        I know that NAPTR records require the same sort of parse.
        I don't much like that, either.)  The more complex the
        parse, the harder it is to get right, both for the author
        and the receiver of such records.  A TLV-based structure
        permits parsing by the author's DNS server, and is easier
        to interpret on the receiving end.
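
        As a rough illustration of the TLV point, the sketch below
        encodes and decodes a record as (type, length, value) triples.
        The type codes and the value layout are invented for this
        example and do not come from any SPF draft; the point is only
        that the receiving side needs a bounds-checked loop rather
        than a string grammar.

```python
import struct

# Hypothetical type codes, invented for this sketch only.
T_IP4_NET = 1   # value: 4 address bytes followed by 1 prefix-length byte
T_ALL_FAIL = 2  # value: empty


def tlv_encode(items):
    """Serialize (type, value-bytes) pairs as 1-byte type, 1-byte length, value."""
    out = bytearray()
    for type_code, value in items:
        if len(value) > 255:
            raise ValueError("value too long for a 1-byte length field")
        out += struct.pack("!BB", type_code, len(value)) + value
    return bytes(out)


def tlv_decode(data):
    """Parse the encoding above; raise ValueError on truncated input."""
    items = []
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated type/length header")
        type_code, length = struct.unpack_from("!BB", data, pos)
        pos += 2
        if pos + length > len(data):
            raise ValueError("truncated value")
        items.append((type_code, data[pos:pos + length]))
        pos += length
    return items
```

        The author's server can validate the record when it is
        encoded, so a malformed policy is rejected at publication
        time instead of at every receiver.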

        The Received-SPF header line is badly specified.  It doesn't
        follow the standards for other RFC 822/2822 headers
        (i.e., it requires exactly one space in certain places
        where an arbitrary amount of white space (including none)
        is permitted in other headers); it has some things as
        comments (receiving host) that should be parseable; and it
        doesn't mandate that Received-SPF lines from outside of
        the domain MUST be deleted.  (The actual requirements here
        are more complex; I won't go into details in this note.)
        Yes, the line as specified is a bit easier to parse, but
        any spam filter is going to have to deal with many other
        headers, and hence will have to have a full-fledged 822/2822
        parser.
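
        To illustrate why fixed single-space rules buy little, here
        is a minimal sketch using Python's standard email parser.
        The header values and the helper name spf_result are made up
        for the example.  It only shows tolerance of missing, extra,
        and folded white space; it does not handle RFC 2822 comments
        placed before the result token.

```python
from email import message_from_string


def spf_result(raw_message):
    """Return the first token of the Received-SPF header, lowercased, or None."""
    msg = message_from_string(raw_message)
    value = msg["Received-SPF"]
    if value is None:
        return None
    # str.split() with no arguments treats any run of spaces, tabs and
    # newlines (from folded header lines) as a single separator.
    tokens = value.split()
    return tokens[0].lower() if tokens else None


# Three spellings of the same (illustrative) header.
one_space = "Received-SPF: pass (mx.example.org: sender permitted)\n\nbody\n"
no_space = "Received-SPF:pass (mx.example.org: sender permitted)\n\nbody\n"
folded = "Received-SPF:   PASS\n\t(mx.example.org: sender permitted)\n\nbody\n"
```

        All three inputs yield the same result token, which is what
        a general 822/2822 parser gives a filter without any special
        rule for this one header.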

        Too many cases can result in an "unknown" return value.
        That makes debugging hard.  There needs to be a "none"
        value, for cases where there is no SPF record; there needs
        to be a type code for "unknown", to distinguish among the
        many error cases.  Beyond that, the set of type codes needs
        to be enumerated -- as is, we'll see an operational nightmare.
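
        One way to picture an enumerated result set is sketched
        below.  The particular result names and reason codes are
        hypothetical, chosen only to show "none" kept separate from
        "unknown" and "unknown" carrying a code that says which
        error occurred.

```python
from enum import Enum


class Result(Enum):
    # Hypothetical result set for illustration.
    PASS = "pass"
    FAIL = "fail"
    NONE = "none"        # the domain publishes no SPF record at all
    UNKNOWN = "unknown"  # a record lookup or evaluation went wrong


class UnknownReason(Enum):
    # Hypothetical reason codes distinguishing the error cases.
    DNS_TIMEOUT = 1
    SYNTAX_ERROR = 2
    UNSUPPORTED_MECHANISM = 3


def describe(result, reason=None):
    """Render a result for logging; UNKNOWN must say why."""
    if result is Result.UNKNOWN:
        if reason is None:
            raise ValueError("an UNKNOWN result requires a reason code")
        return f"unknown ({reason.name.lower()})"
    return result.value
```

        With something like this, a log line reading
        "unknown (dns_timeout)" points an operator at the resolver,
        while "none" says there was simply nothing to check.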

        Section 5 speaks of using Received: lines.  Such lines have
        been forged by spammers for many years.  While they can be
        used, great care must be taken.  This document needs to
        define the necessary steps appropriately.

        5.1 speaks of cidr-lengths, but 5.2 et seq. speak of
        dual-cidr-length.  That looks like something where the
        editing hasn't caught up yet.  But having a CIDR length on
        an MX record is a bad idea, since there may be multiple MX
        records with different appropriate lengths.
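
        A small sketch of the MX problem, using documentation
        address ranges and an invented two-MX setup: one MX sits in
        a /24 the domain controls, the other is a single host in a
        provider's shared range.  The function name and addresses
        are assumptions for the example, not taken from the SPF
        draft.

```python
import ipaddress


def matches_any_mx(client_ip, mx_addrs, prefix_len):
    """True if client_ip falls in the prefix_len-bit network of any MX address."""
    client = ipaddress.ip_address(client_ip)
    return any(
        client in ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in mx_addrs
    )


# Invented example: the first MX is in the domain's own /24; the second
# is one host inside a hosting provider's shared address space.
MX_ADDRS = ["192.0.2.10", "198.51.100.77"]

OWN_OUTBOUND_HOST = "192.0.2.99"        # should be accepted
PROVIDER_NEIGHBOUR = "198.51.100.200"   # should not be accepted
```

        With a single length of 24 both hosts match, so the
        provider's other customers are authorized; with 32 neither
        matches, so the domain's own outbound host is rejected.  No
        one length is right for both MX records.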

        The macro language scares me -- it's very complex.  Note
        that DNS responses over UDP are limited to 512 bytes unless
        EDNS0 is used; EDNS0 is not widely used today, and may incur
        other costs.  But the real problem is that the functionality
        permitted may be far too rich -- arbitrary semantics are
        not a good idea, since they can lead to random breakage
        when some site implements a test that another site isn't
        prepared to meet.  In Postel's language, a sender can't be
        conservative in what it emits if it doesn't know what the
        requirements are.

        8.4 ruins much of the effectiveness of the scheme -- it
        provides ways to avoid processing.  For example, a spam
        engine could send email with local-seeming HELO, MAIL
        FROM, and From: entries, in which case (per Example 3) SPF
        isn't to be used.  Spam from abuse@ or postmaster@ can also
        bypass checks.

        The suggestion that this scheme become default in April,
        2004 (Section 9.4) is preposterous.  Even if the IESG were
        to approve this document today -- and very few documents
        are passed on first try -- it would take far longer than
        four months to build, test, and deploy production-grade
        clients and servers.

        The security considerations section mentions IP address
        spoofing, though the FAQ claims that such attacks aren't real.  I
        agree that classical spoofing, per Morris' 1984 memo, is
        probably not a major threat here.  But spammers are using
        BGP to steal entire address blocks -- that's a bigger
        threat.  (The FAQ also points to RFC 2761 when it should
        be 2671.)

Uptake Model:
        As Rick Adams has pointed out, there is no consensus yet
        that this is the right way to go.  The major ISPs on the
        net -- AOL, Yahoo, MSN, etc. -- have not bought into this
        scheme.  Unless and until they do, it doesn't help much,
        either for their customers (who make up a substantial
        proportion of the user population) or for everyone else
        (since their addresses could be forged).

Semantic Model:
        In a strong sense, the part that requires the most debate
        is the semantic model.  SPF strongly binds a sender to some
        DNS records.  But that isn't always a good idea.  People
        who use portable email addresses will now be constrained
        to use the domain owner's SMTP sender, which may not even
        exist.  (A more interesting model would permit delegation
        of individual user names to particular sending machines.
        But that would probably require too much public key
        cryptography to be affordable.)

        The net effect will be to bind users more strongly to their
        ISPs and/or their employers.  While big ISPs may like that,
        it flies in the face of current (American) public policy
        -- witness local telephone number portability.  Ironically,
        it will also discourage a current anti-spam strategy used
        by many: throw-away email addresses for particular purposes.

        It will also make life harder for people who regularly use
        multiple sending email addresses.  For reasons of privacy,
        my children generally use email addresses that are not readily
        tied to their real names.  But for certain very important
        kinds of communication -- sending email to teachers, for
        example -- they use a family-linked email address.

        It isn't always clear to people what SMTP server they're
        actually using.  Over the last few years, I've noticed that
        one major hotel chain intercepts outbound SMTP connections.
        I don't know if they're trying to defend against check-in
        spammers or if they're trying to help travelers whose
        laptops are hard-wired to point to their company's or home
        ISP's SMTP servers.  Granted, people should use VPNs or
        SMTP/SASL for such things; too many don't and perhaps can't
        -- if your ISP doesn't support it, for example, you can't
        use it, and switching ISPs carries a non-trivial cost,
        especially if you have only one choice of broadband ISP.

        I've underscored some of my points here by using a portable
        email address of my own, rather than my usual email address.

Conclusion:
        The basic concept may or may not be a good idea.  The
        authors themselves admit that it's only part of a total
        anti-spam solution, and I'm not convinced that it's worth
        the deployment effort.  It's strongest in dealing with "joe
        jobs" -- spammers (and worms) impersonating real email
        addresses -- but that's the part that most runs afoul of
        my semantic concerns.