
Re: mutt_FormatString() not multibyte-aware



Hi,

* Ludolf Holzheid [06-07-07 21:18:52 +0200] wrote:

For padding, we need to count the number of character cells, not the
number of characters, and thus, we need something like wcwidth().

Yes, I know.

For all input streams, we need to _extract_ the first multibyte character for padding (right now the code simply takes the first byte). With UTF-8, the lead byte and the continuation-bit check make it easy to count how many bytes the character consists of. Once the character is decoded, wcwidth() can compute its screen width (character cells), I hope.
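A minimal sketch of that byte-counting step, under a few assumptions: the input is UTF-8, and utf8_first_char() is a hypothetical helper name, not an existing mutt function. It reads the length from the lead byte, verifies the continuation bits, and decodes the code point. It does not reject overlong encodings or surrogates, so it is not a full validator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of mutt): return the number of bytes
 * (1..4) occupied by the first UTF-8 character of s, or 0 if s is
 * empty or starts with an invalid or truncated sequence. If cp is
 * non-NULL, the decoded code point is stored there on success. */
static size_t utf8_first_char(const char *s, uint32_t *cp)
{
  const unsigned char *p = (const unsigned char *) s;
  size_t len, i;
  uint32_t c;

  assert(s != NULL);

  if (p[0] == 0)
    return 0;                          /* empty string */

  if (p[0] < 0x80)                { len = 1; c = p[0]; }
  else if ((p[0] & 0xE0) == 0xC0) { len = 2; c = p[0] & 0x1F; }
  else if ((p[0] & 0xF0) == 0xE0) { len = 3; c = p[0] & 0x0F; }
  else if ((p[0] & 0xF8) == 0xF0) { len = 4; c = p[0] & 0x07; }
  else
    return 0;                          /* stray continuation or bad lead byte */

  for (i = 1; i < len; i++)
  {
    /* Every following byte must look like 10xxxxxx. A terminating
     * NUL fails this check too, so we never read past the string. */
    if ((p[i] & 0xC0) != 0x80)
      return 0;
    c = (c << 6) | (p[i] & 0x3F);
  }

  if (cp)
    *cp = c;
  return len;
}
```

The decoded code point could then be handed to wcwidth() to get the cell count, assuming a platform where wchar_t holds Unicode code points and a suitable locale has been set; that part is left out here because it depends on the runtime locale.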

My only hope was that we could get away with something more lightweight than going straight to wchar_t for all format strings, as Tamo's patch/ugly hack does. Since format strings are used in many places, a full wchar_t-based implementation may be technically correct, but a dozen new function and conversion calls for each formatting step won't make mutt any faster.

The latter was my only concern. I just said that I hope we can get away more cheaply with UTF-8, because some people use UTF-8 natively, so for them there wouldn't be any conversion at all.

(I wonder why Tamo didn't jump into that earlier -- Maybe he's just
tired of explaining to the Europeans there are other scripts than
Latin.)

I understand and know that perfectly well. Maybe it's just that I didn't make myself clear enough.

  bye, Rocco
--
:wq!