
[Mutt] #3325: attachment type misdetection for small .tar.gz



#3325: attachment type misdetection for small .tar.gz
------------------------------+---------------------------------------------
 Reporter:  antonio@xxxxxxxx  |       Owner:  mutt-dev
     Type:  defect            |      Status:  new     
 Priority:  minor             |   Milestone:          
Component:  mutt              |     Version:  1.5.20  
 Keywords:                    |  
------------------------------+---------------------------------------------
 Hi,
 we have found that mutt misdetects the attachment type of a small
 .tar.gz file; the complete bug report is at
 http://bugs.debian.org/541241.

 I had a look at your code: in sendlib.c, the mutt_make_file_attach
 function takes care of looking up the MIME type. It calls
 'mutt_lookup_mime_type', which reports the MIME type back to mutt based
 on the file extension. Unfortunately .gz, .bz2 and other compressed
 formats are not classified as MIME types because they are 'encodings', so
 you won't find anything for them in /etc/mime.types.

 In that case (i.e. no content type found in mime.types) you try to
 guess whether it's a binary file or not with this check:

 {{{
     if (info->lobin == 0 ||
         (info->lobin + info->hibin + info->ascii) / info->lobin >= 10)
     {
       /*
        * Statistically speaking, there should be more than 10% "lobin"
        * chars if this is really a binary file...
        */
       att->type = TYPETEXT;
       att->subtype = safe_strdup ("plain");
     }
     else
     {
       att->type = TYPEAPPLICATION;
       att->subtype = safe_strdup ("octet-stream");
     }
 }}}

 In this particular case, the small .tar.gz file produces the following
 info data:
 {{{
 (gdb) print *info
 $5 = {hibin = 43584, lobin = 8102, crlf = 522, ascii = 29871, linemax =
 2790, space = 0, binary = 1, from = 0, dot = 0, cr = 1}
 (gdb)
 }}}

 Unfortunately this gives a total-to-lobin ratio slightly above 10, i.e.
 the share of lobin chars is just under 10%:
 {{{
 >>> (43584+8102+29871)/8102.0
 10.066279930881263
 >>>
 }}}

 so mutt believes that the attachment is not a binary file.
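
 The check can be reproduced outside mutt. Below is a minimal sketch,
 not mutt's code: the struct and function names (content_counts,
 looks_like_text, lobin_share_x100) are made up for illustration, and only
 the three counters used by the test are modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters in mutt's info struct; only
 * the three fields that the heuristic reads are modelled here. */
struct content_counts {
  size_t hibin;   /* value taken from the gdb dump's hibin field */
  size_t lobin;   /* value taken from the gdb dump's lobin field */
  size_t ascii;   /* value taken from the gdb dump's ascii field */
};

/* Mirrors the quoted test: treat as text if there are no lobin chars,
 * or if total / lobin (integer division) is at least 10. */
static bool looks_like_text(const struct content_counts *c)
{
  if (c->lobin == 0)
    return true;
  return (c->lobin + c->hibin + c->ascii) / c->lobin >= 10;
}

/* Share of lobin chars in hundredths of a percent, integer-only,
 * so 993 means 9.93%. Returns 0 for an empty file. */
static size_t lobin_share_x100(const struct content_counts *c)
{
  size_t total = c->lobin + c->hibin + c->ascii;
  return total ? (c->lobin * 10000) / total : 0;
}
```

 With hibin = 43584, lobin = 8102 and ascii = 29871, looks_like_text()
 returns true and lobin_share_x100() returns 993, so the file misses the
 10% threshold by less than a tenth of a percentage point.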

 How could you fix this?
 There are some options: you could link against libmagic and let it take
 care of the MIME types, or you could add a check in
 mutt_lookup_mime_type (sendlib.c) that covers the major compressed
 formats by assigning them the content type application/octet-stream
 (.gz, .zip and .bz2 should be enough).
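
 To make the second option concrete, here is a rough sketch of such a
 suffix check. It is not a patch against sendlib.c: the helper names
 (has_compressed_suffix, fallback_content_type) are hypothetical, and how
 it would be wired into mutt_lookup_mime_type is left open. It assumes a
 POSIX system for strcasecmp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical helper, not part of mutt: returns true when the file
 * name ends in one of the compressed-format suffixes named above.
 * The comparison is case-insensitive. */
static bool has_compressed_suffix(const char *path)
{
  static const char *const suffixes[] = { ".gz", ".zip", ".bz2" };
  size_t len = strlen(path);

  for (size_t i = 0; i < sizeof suffixes / sizeof suffixes[0]; i++) {
    size_t slen = strlen(suffixes[i]);
    /* Require at least one character before the suffix. */
    if (len > slen && strcasecmp(path + len - slen, suffixes[i]) == 0)
      return true;
  }
  return false;
}

/* Returns a content type for known compressed suffixes, or NULL so
 * the caller can fall back to the existing lobin heuristic. */
static const char *fallback_content_type(const char *path)
{
  return has_compressed_suffix(path) ? "application/octet-stream" : NULL;
}
```

 The idea is that a name such as backup.tar.gz gets
 application/octet-stream before the lobin heuristic is consulted, while
 anything else returns NULL and goes through the current code path.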

 If you want, I can propose a patch. libmagic is probably the best option,
 but I suppose that you don't want to introduce another dependency.

 Cheers
 Antonio

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/3325>
Mutt <http://www.mutt.org/>
The Mutt mail user agent