Re: [Mutt] #2995: Mutt is folding subject line folds using a tab,



On Wednesday, November 28 at 09:21 AM, quoth Mutt:
Please look up what WSP is in RfC2822 (it's the same as LWSP in the older RfC822):

Just because a WSP can be either a tab or a space does not mean that the two are interchangeable or that they carry no semantic meaning. The RFC says that you may only insert CRLFs to wrap headers, not swap out some characters for other characters of the same class.

If we allow any character of the same class (e.g. WSP or US-ASCII) to be swapped out for any other character when wrapping, the example

    Subject: This is a test

Could instead be "wrapped" to:

    Subject: This
     xx x xxxx

Obviously these are not the same thing.

It's true that there's some ambiguity in the first part of the standard, where it says

    The general rule is that wherever this standard allows for folding
    white space (not simply WSP characters), a CRLF may be inserted
    before any WSP.

But it seems to me that what is intended here is that it could be rewritten as:

    ...a CRLF may be inserted before any EXISTING WSP.

This is upheld by the later description of how to unfold a header:

    Unfolding is accomplished by simply removing any CRLF that is
    immediately followed by WSP.

In other words, by merely removing the CRLF, we should have the original, pre-folding version of the header. Thus, when folding, we may only ADD CRLFs (in specific places), rather than give ourselves the freedom to delete and replace some of the characters of the original header.
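To make the round trip concrete, here is a minimal Python sketch of folding and unfolding under the reading argued above. It is not mutt's code: the names `fold` and `unfold` and the `width` parameter are made up for illustration, and the sketch ignores everything else a real mailer has to handle (encoded words, line-length limits from the RFC, and so on).

```python
import re

CRLF = "\r\n"

def fold(header: str, width: int = 14) -> str:
    """Fold by inserting CRLF before EXISTING whitespace; no character is changed."""
    lines, current = [], ""
    # Each chunk is a word together with the whitespace that precedes it,
    # so every possible break point sits immediately before a real WSP.
    for chunk in re.findall(r"[ \t]*[^ \t]+|[ \t]+", header):
        if current and chunk[0] in " \t" and len(current) + len(chunk) > width:
            lines.append(current)
            current = chunk
        else:
            current += chunk
    lines.append(current)
    return CRLF.join(lines)

def unfold(folded: str) -> str:
    """Remove any CRLF that is immediately followed by WSP; keep that WSP."""
    return re.sub(r"\r\n(?=[ \t])", "", folded)

header = "Subject: This is a test"
folded = fold(header)
# Unfolding must give back exactly what we started with.
assert unfold(folded) == header
print(repr(folded))
```

Because `fold` only ever adds CRLFs, `unfold` needs no knowledge of how the header was folded: deleting those CRLFs is enough to restore the original.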

i.e. WSP is either a space or tab character. Mutt does everything correctly.

The idea is that folding is done by inserting CRLFs in strategic places, namely, just before WSPs. That doesn't mean we get to swap out one WSP character for another WSP character. It does NOT say that all WSPs are semantically equivalent, and can be swapped around according to the whims of the mail client. The WSP (in folding) is ONLY an indicator of where you can add a CRLF. The ORIGINAL WSP must stay intact.

The header:

    Subject: This is a test

May be folded like this:

    Subject: This<CRLF> is a test

But NOT like this:

    Subject: This<CRLF><TAB>is a test
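One way to test a candidate folding against this rule is to unfold it and compare the result with the original header. The following is only a sketch in Python; `is_faithful_fold` is a hypothetical helper name, not anything from mutt or from the RFC.

```python
import re

def unfold(folded: str) -> str:
    # Drop each CRLF that is directly followed by a space or tab,
    # leaving that space or tab untouched.
    return re.sub(r"\r\n(?=[ \t])", "", folded)

def is_faithful_fold(original: str, folded: str) -> bool:
    # A folding is acceptable only if unfolding reproduces the original exactly.
    return unfold(folded) == original

original = "Subject: This is a test"
space_fold = "Subject: This\r\n is a test"   # CRLF added before the existing space
tab_fold = "Subject: This\r\n\tis a test"    # the space was replaced by a tab

print(is_faithful_fold(original, space_fold))
print(is_faithful_fold(original, tab_fold))
```

The first folding passes the check and the second fails it, which is the distinction the two examples above are meant to show.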

Though I think a tab is traditionally used, I didn't find evidence that any folding WSP is to be treated as SP. RfC822 even says:

Unfolding is accomplished by regarding CRLF immediately followed by a LWSP-char as equivalent to the LWSP-char

which I interpret to mean that a tab used for folding is indeed a tab in the header.

RFC 2822 clarified this, making it more obvious that when unfolding a header you may ONLY remove CRLFs (and only those CRLFs that are followed by a WSP character), and that everything else about the header must remain as-is, including that WSP character. Note that RFC 822 doesn't say that a CRLF-LWSP sequence is equivalent to *any* LWSP character; it says that a CRLF-LWSP sequence is equivalent to *that* LWSP character.

Thus, if mutt is replacing spaces with tabs (which it is), a CORRECT unfolding of those folded headers MUST preserve those tabs.

If mutt is transforming:

    Subject: This is a test

...into:

    Subject: This<CRLF><TAB>is a test

...(which is exactly what it is currently doing) then the only correct unfolding of this header is:

    Subject: This<TAB>is a test

Which is obviously not desirable.
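For anyone who wants to see the damage character by character, here is a small Python sketch that applies the RFC 822 wording literally (a CRLF followed by an LWSP-char is treated as that same LWSP-char) to the tab-folded header and lists where the result differs from what was originally written. The variable names are mine, chosen for illustration.

```python
import re

original = "Subject: This is a test"
tab_folded = "Subject: This\r\n\tis a test"

# The capture group keeps whichever character followed the CRLF,
# so a tab stays a tab and a space stays a space.
unfolded = re.sub(r"\r\n([ \t])", r"\1", tab_folded)

# Collect (index, original char, unfolded char) for every mismatch.
diffs = [
    (i, a, b)
    for i, (a, b) in enumerate(zip(original, unfolded))
    if a != b
]

print(repr(unfolded))
print(diffs)
```

The two strings have the same length, and the only mismatch is the position where the original space has turned into a tab.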

~Kyle
--
If you make people think they're thinking, they'll love you; but if you really make them think, they'll hate you.
                                                        -- Don Marquis
