2006-06-25

The (not always so) powerful valgrind

I don't think there's a respectable C programmer who hasn't heard of valgrind, the tool for checking (among other things) memory access violations in a program. In a program that I'm writing, I was hitting an assertion failure where I shouldn't have. Something led my program into an inconsistent state, and I couldn't figure out what. The failure appeared seemingly at random, which is usually a manifestation of some memory management problem. So I ran the program through valgrind, and - no errors (apart from those reported for the gethostbyname() function). With the help of hardware breakpoints in GDB, I tracked the problem down to the following piece of code (roughly):

struct smth {
    int state;
    ...
    char buf[MAXBUF];
};
static struct smth a[16384];
...
struct smth *p;
...
p->buf[i] = 0;

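At certain points in the code, the i variable was equal to MAXBUF, so the write landed one byte past the end of buf and overwrote the state member of the next structure in the array. That address is still within the bounds of the array, and valgrind only checks whether an address is valid to access, not which member the access was meant for, so it didn't complain, although this is a serious programming error.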

I'm coding a user-level thread scheduler using the makecontext() family of functions. This doesn't help either - the debugger gets very confused when trying to trace through such a program. Apparently, it can't single-step across swapcontext() boundaries. So I had to put a hardware breakpoint on data change (for the state member), with the additional condition that state is set to 0. I fixed the code by changing it to

p->buf[MAXBUF - 1] = 0;
(In this case, this is correct, although not strictly equivalent to what was previously there.)

Lesson: use assertions abundantly. Whenever you get an assertion failure, it's an indication that you have a wrong idea about your program's behaviour. Better to find that out sooner than later. And don't think that your program is error-free just because valgrind says so.
