
Re: [Mutt] #2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)



#2956: Recipient address broken if containing Š character (UTF-8 code: 0xc5 0xA0)

Comment (by Rocco Rutte):

 {{{
 Hi,

 * Mutt [07-09-18 07:13:35 -0000] wrote:

 > > Mutt doesn't need to recognize //TRANSLIT strings. The fact that I have
 > > a //TRANSLIT string in my $charset should not have any effect on what
 > > Mutt sends. $charset is only a terminal-related variable.

 > Mutt tries to convert your terminal input from $charset into each of the
 > charsets specified in $send_charset. If it fails with the first one, it
 > tries the second, etc., but if all of them fail, mutt uses your terminal's
 > $charset.

 Some code to illustrate from rfc2047.c:

    /* Choose target charset. */
    tocode = fromcode;
    if (icode)
    {
      if ((tocode1 = mutt_choose_charset (icode, charsets, u, ulen, 0, 0)))
        tocode = tocode1;
      else
        ret = 2, icode = 0;
    }

 In case no item from $send_charset matches, mutt_choose_charset() fails
 and thus tocode remains $charset.

 Maybe the docs should clearly state this.
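
 To make the fallback concrete, here is a minimal standalone sketch of the
 same idea using plain iconv(3). It is not mutt's code: choose_charset() and
 converts_cleanly() are hypothetical names, the colon-separated candidate
 list and the return value 2 merely mirror the snippet above, and treating
 any non-reversible conversion as a failure is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return 1 if `input` (encoded in `from`) converts
 * to `to` without errors and without non-reversible substitutions. */
static int converts_cleanly(const char *from, const char *to, const char *input)
{
  iconv_t cd = iconv_open(to, from);
  if (cd == (iconv_t) -1)
    return 0;                     /* unknown charset counts as a failure */

  char buf[256];
  char *in = (char *) input;      /* iconv() does not modify the input bytes */
  size_t inleft = strlen(input);
  int ok = 1;

  while (inleft > 0)
  {
    char *out = buf;
    size_t outleft = sizeof(buf);
    size_t n = iconv(cd, &in, &inleft, &out, &outleft);
    if (n == (size_t) -1)
    {
      if (errno == E2BIG)
        continue;                 /* output buffer full: discard and go on */
      ok = 0;                     /* EILSEQ or EINVAL: cannot represent input */
      break;
    }
    if (n > 0)
    {
      ok = 0;                     /* assumption: lossy conversion is a failure */
      break;
    }
  }
  iconv_close(cd);
  return ok;
}

/* Hypothetical stand-in for the selection step: try each entry of the
 * colon-separated `candidates` list in order and copy the first one that
 * works into `chosen`. Returns 0 on a match; if nothing matches, copies
 * `from` (the terminal charset) instead and returns 2. */
static int choose_charset(const char *from, const char *candidates,
                          const char *input, char *chosen, size_t len)
{
  assert(from && candidates && input && chosen && len > 0);

  const char *p = candidates;
  while (*p)
  {
    size_t n = strcspn(p, ":");
    if (n > 0 && n < len)
    {
      memcpy(chosen, p, n);
      chosen[n] = '\0';
      if (converts_cleanly(from, chosen, input))
        return 0;
    }
    p += n;
    if (*p == ':')
      p++;
  }

  /* No candidate matched: the target charset stays the source charset. */
  strncpy(chosen, from, len - 1);
  chosen[len - 1] = '\0';
  return 2;
}
```

 With "UTF-8" as the source and "us-ascii:iso-8859-1" as the candidates, the
 Š character cannot be represented by either entry, so the sketch falls back
 to "UTF-8", which corresponds to tocode remaining $charset.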

 > Thus we really need to fix the isspace() problem - and IMHO not only in
 > 1.6 but some simple fix is also needed for 1.4 and 1.5. In mbyte.c we
 > already have:

 > int iswspace (wint_t wc)
 > {
 >   if (Charset_is_utf8 || charset_is_ja)
 >     return (9 <= wc && wc <= 13) || wc == 32;
 >   else
 >     return (0 <= wc && wc < 256) ? isspace (wc) : 0;
 > }

 These are only used if your system lacks wide character functions or
 you told configure to ignore them.

 > So I think the easiest solution for 1.4 and 1.5 would be to write a local
 > isspace() function the same way - and for 1.6 consider the proper
 > solution.

 Do you mean something like:

    #define isspace(c)    ((c) == ' ' || (c) == '\t' || ...)

 I see the need for fixing this quickly, but you really have to make sure
 nothing else breaks. So I'd prefer some analysis of exactly which places
 use isspace() and shouldn't.
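
 As a minimal sketch of what such a locale-independent check could look
 like (a function rather than a macro, so the argument is evaluated only
 once): is_ascii_space() and skip_ascii_space() are hypothetical names, not
 existing mutt functions, and the accepted set is the same one the
 iswspace() replacement quoted above uses (9..13 and 32).

```c
#include <assert.h>
#include <stddef.h>

/* Locale-independent whitespace test: true only for the six ASCII
 * whitespace characters (TAB, LF, VT, FF, CR, SPACE). Unlike isspace(),
 * it never reports 0xA0 as a space, whatever the current locale is. */
static int is_ascii_space(int c)
{
  return (c >= 9 && c <= 13) || c == 32;
}

/* Hypothetical helper: skip leading ASCII whitespace, e.g. in front of
 * an address. Bytes >= 0x80 are never skipped, so multibyte UTF-8
 * sequences stay intact. */
static const char *skip_ascii_space(const char *s)
{
  assert(s != NULL);
  while (is_ascii_space((unsigned char) *s))
    s++;
  return s;
}
```

 With this check, the second byte of Š (0xA0) is treated as ordinary data,
 so the 0xC5 0xA0 sequence is not split.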

 > Personally I don't care if 0xA0 wouldn't be recognized as a space -
 > probably no one uses NBSP to delimit several email addresses within a
 > recipient list.

 I didn't look it up, but I even think it's wrong to accept NBSP in such
 places. For mutt in general the only interesting places needing proper
 NBSP recognition are those which may break lines (e.g. the f=f handler,
 header folding in the pager, etc.), IMHO.

    bye, Rocco
 }}}

-- 
Ticket URL: <http://dev.mutt.org/trac/ticket/2956#comment:>