
Re: [Mutt] #2995: Mutt is folding subject line folds using a tab,



#2995: Mutt is folding subject line folds using a tab, which appears to be
against RFC 2822

Comment (by Kyle Wheeler):

 {{{
 On Wednesday, November 28 at 09:21 AM, quoth Mutt:
 > Please look up what WSP is in RfC2234 (it's the same as LWSP in the
 > older RfC822):

 Just because a WSP can be either a tab or a space does not mean that
 they are interchangeable and that they have no semantic meaning. The
 RFC says that you may only insert CRLFs to wrap headers, not swap out
 some characters for other characters of the same class.

 If we allow any character of the same class (e.g. WSP or US-ASCII) to
 be swapped out for any other character when wrapping, the example

      Subject: This is a test

 Could instead be "wrapped" to:

      Subject: This
       xx x xxxx

 Obviously these are not the same thing.

 It's true that there's some ambiguity in the first part of the
 standard, where it says

      The general rule is that wherever this standard allows for folding
      white space (not simply WSP characters), a CRLF may be inserted
      before any WSP.

 But it seems to me that what is intended here is that it could be
 rewritten as:

      ...a CRLF may be inserted before any EXISTING WSP.

 This is upheld by the later description of how to unfold a header:

      Unfolding is accomplished by simply removing any CRLF that is
      immediately followed by WSP.

 In other words, by merely removing the CRLF, we should have the
 original, pre-folding version of the header. Thus, when folding, we
 may only ADD CRLFs (in specific places), rather than give ourselves
 the freedom to delete and replace some of the characters of the
 original header.
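
A minimal sketch of that unfolding rule in Python, for illustration only.
The function name `unfold` is my own and is not taken from mutt or from
the RFC; the only behaviour it relies on is the quoted sentence above.

```python
import re

def unfold(header: str) -> str:
    """Remove every CRLF that is immediately followed by a space or tab.

    The WSP character after the CRLF is left exactly as it was, and a
    CRLF that is not followed by WSP (a real header boundary) is kept.
    """
    return re.sub(r"\r\n(?=[ \t])", "", header)

print(repr(unfold("Subject: This\r\n is a test")))
```

Nothing but the CRLF is consumed by the pattern, so whatever whitespace
character the folder left behind is what ends up in the unfolded header.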

 > i.e. WSP is either a space or tab character. Mutt does everything
 > correct.

 The idea is that folding is done by inserting CRLFs in strategic
 places, namely, just before WSPs. That doesn't mean we get to swap out
 one WSP character for another WSP character. It does NOT say that all
 WSPs are semantically equivalent, and can be swapped around according
 to the whims of the mail client. The WSP (in folding) is ONLY an
 indicator of where you can add a CRLF. The ORIGINAL WSP must stay
 intact.

 The header:

      Subject: This is a test

 May be folded like this:

      Subject: This<CRLF> is a test

 But NOT like this:

      Subject: This<CRLF><TAB>is a test
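
To make the distinction concrete, here is a rough sketch of a folder that
only ever inserts CRLFs in front of whitespace that is already present.
The name `fold` and the `limit` parameter are hypothetical, this is not
how mutt is implemented, and edge cases such as long runs of whitespace
(which could produce a whitespace-only line) are deliberately ignored.

```python
def fold(header: str, limit: int = 78) -> str:
    """Insert CRLF before existing WSP so lines fit in `limit` where possible."""
    out = []
    line_start = 0   # index where the current physical line begins
    last_wsp = -1    # latest space/tab on this line usable as a fold point
    for i, ch in enumerate(header):
        if ch in " \t" and i > line_start:
            last_wsp = i
        if i - line_start >= limit and last_wsp > line_start:
            out.append(header[line_start:last_wsp])
            line_start = last_wsp  # the original WSP starts the next line
            last_wsp = -1
    out.append(header[line_start:])
    # Only CRLFs are added; no character of the header is replaced.
    return "\r\n".join(out)

print(repr(fold("Subject: This is a test", limit=13)))
```

Because the pieces are slices of the original string joined by CRLF,
deleting the CRLFs from the result always gives back the original header.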

 > Though I think a tab is traditionally used, I didn't find evidence
 > that any folding WSP is to be treated as SP. RfC822 even says:
 >
 >   Unfolding is accomplished by regarding CRLF immediately followed
 >   by a LWSP-char as equivalent to the LWSP-char
 >
 > which I interpret as tab for folding indeed means tab in the header.

 RFC 2822 clarified this, making it more obvious that when unfolding
 a header you may ONLY remove CRLFs (and only those CRLFs that are
 followed by a WSP character), and that everything else about the
 header must remain as-is, including that WSP character. Note that RFC
 822 doesn't say that a CRLF-LWSP sequence is equivalent to *any* LWSP
 character; it says that a CRLF-LWSP sequence is equivalent to *that*
 LWSP character.

 Thus, if mutt is replacing spaces with tabs (which it is), a CORRECT
 unfolding of those folded headers MUST preserve those tabs.

 If mutt is transforming:

      Subject: This is a test

 ...into:

      Subject: This<CRLF><TAB>is a test

 ...(which is exactly what it is currently doing) then the only correct
 unfolding of this header is:

      Subject: This<TAB>is a test

 Which is obviously not desirable.
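
The same point as a small round-trip check, again only a sketch: the
helper names `unfold` and `is_lossless_fold` are made up for this
example, and the two folded strings are the ones shown above written out
with explicit CRLF and TAB characters.

```python
import re

def unfold(header: str) -> str:
    # Strip only the CRLFs that are followed by a space or tab.
    return re.sub(r"\r\n(?=[ \t])", "", header)

def is_lossless_fold(original: str, folded: str) -> bool:
    # A fold is acceptable only if unfolding returns the original header.
    return unfold(folded) == original

original = "Subject: This is a test"
preserving = "Subject: This\r\n is a test"     # CRLF inserted before the existing space
substituting = "Subject: This\r\n\tis a test"  # space replaced by a tab while folding

print(is_lossless_fold(original, preserving))    # True
print(is_lossless_fold(original, substituting))  # False
print(repr(unfold(substituting)))
```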

 ~Kyle
 }}}

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2995#comment:>