Hello, Derek.

Sorry for taking so long to reply to your mail. I've been quite busy lately trying to solve this problem, and I have partly resolved it. See below.

* Derek Martin <invalid@xxxxxxxxxxxxxx> [19-08-2004 12:33]:
> Can you be more specific about when the problem occurs? Can you
> provide a link to a mailbox which exhibits the problem?

I have debugged it far enough to find that the messages that display accents as a space in the index are the ones that are iso8859-1 but have attachments. The Content-Type shows as "multipart/alternative", or something similar, and the messages don't define their charset anywhere in the headers. In the pager view, the accent in this case shows as "\341".

However, look at this:

From: =?iso-8859-1?q?Patr=EDcia=20Rosa?= <plsrosa2002@xxxxxxxxxxxx>
Subject: Lim ozinho_safado!!!!!!!!!!!!!!
Content-Type: multipart/mixed; boundary="0-1219368190-1092794218=:86288"

The "From" is displayed correctly in the pager, but not in the index. The accent (the name is Patrícia Rosa) comes out as a space. I think mutt should display it correctly in both views, as it seems to be correctly quoted.

The accent in the subject comes out as a space in the index and as \343 in the pager. However, it is not quoted in any way, so I guess mutt has no way to know whether that character should be parsed as utf-8 or as iso8859-1. In fact, when I copy/paste it into mozilla, that space shows as a square with FF F0 inside (which is clearly utf-8).

> I'm not sure that there is... You could pipe it through iconv, but
> you will probably break the mail.

Yes... I haven't even tested it.

What made some things work more correctly was recompiling ncurses, passing "--enable-widec" to its configure script to enable wide character support (utf-8). This initially broke some things, because when compiled this way the library is named libncursesw.so, and not libncurses.so.
A simple symlink solved this problem, but I was told that this will break some things, as the two libraries are not binary compatible.

Recompiling libncurses this way, and making mutt use it instead of slang, resolved the problem with glitches in the index view. Before that, every line that had an accent would be displayed a bit to the left (as if internally the library was counting 2 characters from utf-8, and displaying just one). So, the From name would appear correctly, but the remaining fields would be displayed some spaces to the left. When I pressed page-down to go to the next screen, mutt wouldn't redraw the screen correctly, requiring a ctrl-l.

However, linking mutt against this new libncurses doesn't solve the problem in the pager view. I'm quite sure this must be either a mutt error or a configuration error.

> Ok, that's fine, but how are you starting gnome terminal? You didn't
> actually answer my question... Even if you set your locale properly
> in your .bashrc or whatever, the gnome-terminal may still be started
> with a different locale if it is started by your window manager. For
> example, if your system's default locale is different from that which
> you have set in your .bashrc, the windowing system may not (and
> probably won't) read your .bashrc before it starts, so programs
> started from it will have the system's default locale, NOT the one you
> defined in your .bashrc file.
>
> The above depends on your default system locale, as well as how the
> windowing system was started... If the system's default locale is
> en_US, and the windowing system is started at boot time, then programs
> started by it may actually be started with a locale of en_US. This
> will cause problems.

The whole system is started as utf-8. GDM sets my locale to utf-8. All gnome applications seem to work fine with utf-8. The terminal's locale is en_US.UTF-8, and I'm not setting it in any script.

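
For illustration, here is a minimal Python sketch of the index misalignment described above, where a line with an accent ends up shifted to the left. It is not mutt or ncurses code. The helper names pad_by_bytes and pad_by_chars are hypothetical, and the sketch assumes every character occupies exactly one terminal column, which holds for Latin accented letters such as í.

```python
# Hypothetical helpers for illustration only; not taken from mutt or ncurses.

def pad_by_bytes(text: str, width: int) -> str:
    """Pad a field while wrongly counting every UTF-8 byte as one column."""
    used = len(text.encode("utf-8"))
    return text + " " * max(width - used, 0)


def pad_by_chars(text: str, width: int) -> str:
    """Pad a field counting code points (assumes one column per character)."""
    return text + " " * max(width - len(text), 0)


name = "Patr\u00edcia Rosa"  # "Patrícia Rosa"; the í takes two bytes in UTF-8

# The byte count is one larger than the character count for this name.
print(len(name), len(name.encode("utf-8")))

# Byte-based padding leaves the field one column short, so whatever is
# printed after it lands one position to the left.
print(repr(pad_by_bytes(name, 20)))
print(repr(pad_by_chars(name, 20)))
```

Counting code points is only a rough stand-in for real display width: combining marks and double-width characters would need a proper column-width lookup.
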
> The problem here is that the characters are being output using an
> encoding which is different from your locale. In other words, the
> script is sending iso-8859-1 characters to your terminal, but the
> terminal is interpreting them as UTF-8 characters. Any codes which
> don't map the same in both locales will be errors.

I know that... vim does the conversion well. Running that script shows spaces instead of characters, as expected. What bothers me is mutt not doing the conversion.

> For characters to be displayed properly, all of the following must
> match locale:
>
> - the locale which your terminal was started with
> - the actual characters being sent to the terminal
> - the locale of the shell being run by your terminal
> - the font used to display the characters

The font doesn't seem to be a problem here, as I can see the characters if I type them. The locale also seems to be right. All files I write are encoded in utf-8 by default, and everything else works happily. However, the characters being sent to the terminal seem to be wrong. I'm not sure whether this is a configuration problem or a mutt internal problem, but there surely is a problem somewhere. It doesn't seem to be a configuration error, as I have already tried running mutt using 'mutt -nF /dev/null', and the problem is still the same.

> In the case of e-mail, the received e-mail's MIME type must also
> match. If the charset of the data being written to the terminal is
> different from the charset of your terminal, then you need to use
> iconv to convert it. However, if you have your settings set properly,
> mutt should do this for you.
>
> So, please identify a specific message which exhibits the problem, and
> then answer the following questions:
>
> 1. What is the character set named in the Content-Type field of the
> message?
>
> 2. What is the complete output of the locale command on your terminal
> prior to starting

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

> 3. What is the system's default locale?

Same as above.

> 4. After you start mutt, what is the value of $charset (in mutt)?

utf-8

> 5. Are you sure your font can display all the relevant glyphs?

Yes, I can see the characters: ç ã á é ó í
They all display fine here (c cedilla, a tilde, acute on a, e, o and i).

> 6. What is the value of mutt's $send_charset?

send_charset="us-ascii:iso-8859-1:utf-8"

This isn't set by a config file; it seems to be the default.

The display problem in the index view was resolved by changing libncurses to libncursesw. However, the ncurses maintainer told me that mutt must take into account that utf-8 characters are 2 or 3 bytes wide in order to display things properly aligned.

-- 
Bruno Lustosa, aka Lofofora          | Email: bruno@xxxxxxxxxxx
Network Administrator/Web Programmer | ICQ: 1406477
Rio de Janeiro - Brazil              |