2005-07-26

Strange NetBSD behaviour, C++ allocators and containers

I guess I'm one of the few people who have stumbled upon a very unpleasant implementation detail (some even call it a bug) of the NetBSD (2.0.2) buffer cache. The symptom is that the ioflush daemon (actually a kernel thread), which is responsible for flushing dirty pages back to disk, sometimes goes crazy.

I did some internet searches before posting to the tech-kern NetBSD mailing list, but didn't come up with anything useful. You can read my complaint and the discussion that it generated here.

The scenario in short: I have a data-processing application that uses BerkeleyDB. I have set up the BDB environment so that it uses an mmap-allocated cache (512MB) backed by a real file on the file system. Pages belonging to this file get dirty quite quickly, so ioflush becomes active relatively often. When that happens, my program stops (in the 'vnlock' wait channel) and does no useful work. I can't even access the file system where these files reside because of some locking issue (also mentioned in the thread above).

I recognized that the problem might be in locking, and at first I blamed the in-kernel NFS server. So I recompiled the kernel, but the problem persisted.

As you can read in the referenced thread, I was so desperate about the problem that I was even considering installing another OS. Fortunately, in a moment of inspiration (after a good night's sleep) I remembered that I had seen SYSV SHM mentioned in the BDB documentation that I had browsed some time ago. Recompiling the kernel to allow 1G of SYSV shared memory and reconfiguring the BDB 'environment' to use SYSV SHM instead of mmap()ed file-backed storage worked around the kernel problem.
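To make the difference concrete, here is a minimal sketch of what System V shared memory looks like from a program. This is not BDB's code and not my actual configuration; it only uses the standard shmget(), shmat(), shmdt() and shmctl() calls. The names SysvSegment and shm_demo are made up for the illustration, and the segment is private (IPC_PRIVATE), whereas a cache shared between processes would need a key. The point is that such a segment has no file on a file system behind it, and that its size is capped by a kernel limit.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// RAII wrapper around one private System V shared memory segment.
// Hypothetical helper, for illustration only.
class SysvSegment {
    int id_;
    void* addr_;
    std::size_t bytes_;

public:
    explicit SysvSegment(std::size_t bytes)
        : id_(-1), addr_(nullptr), bytes_(bytes) {
        // Fails if the request exceeds the kernel's shared memory limit.
        id_ = shmget(IPC_PRIVATE, bytes, IPC_CREAT | 0600);
        if (id_ == -1) throw std::runtime_error("shmget failed");
        void* p = shmat(id_, nullptr, 0);
        // Mark the segment for removal; it stays usable until the last detach.
        shmctl(id_, IPC_RMID, nullptr);
        if (p == reinterpret_cast<void*>(-1))
            throw std::runtime_error("shmat failed");
        addr_ = p;
    }
    ~SysvSegment() { shmdt(addr_); }
    SysvSegment(const SysvSegment&) = delete;
    SysvSegment& operator=(const SysvSegment&) = delete;

    void* data() const { return addr_; }
    std::size_t size() const { return bytes_; }
};

// Usage sketch: dirty every byte of a 1MB segment and read it back.
bool shm_demo() {
    SysvSegment seg(1u << 20);
    assert(seg.data() != nullptr);
    std::memset(seg.data(), 0xAB, seg.size());
    const unsigned char* bytes = static_cast<const unsigned char*>(seg.data());
    return bytes[0] == 0xAB && bytes[seg.size() - 1] == 0xAB;
}
```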

You might ask why this didn't happen before: because I hadn't been using the BDB 'environment' until a few days ago. Without an environment, BDB allocates its cache memory with malloc(), which eventually uses sbrk(). Unfortunately, in NetBSD you can't have more than 1GB of sbrk() heap (but you can mmap() almost all of the 3GB available to user applications), so BDB competed for heap memory with my memory-hungry application.

One day maybe I'll implement a C++ allocator on top of AT&T's vmalloc, with a corresponding discipline that allocates memory by anonymous mmap(). In C++ every allocator is bound to an object type, which is fixed in size. This is an ideal use for vmalloc's vmpool method, which allocates objects of the same size within a memory region. This has several advantages:
  • no need to maintain free block lists
  • does not produce internal fragmentation (so it wastes less memory than an 'ordinary' malloc implementation)
  • it is faster
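The idea can be sketched without vmalloc at all. What follows is a hand-rolled stand-in, not vmalloc's API and not its vmpool method: a pool of equal-sized slots carved out of one anonymous mmap() region, plus a minimal C++11 allocator on top of it. The names MmapPool, PoolAllocator and pool_demo, and the capacity of 65536 slots per type, are made up for the illustration. It assumes a system that provides MAP_ANON, types whose alignment does not exceed that of std::max_align_t, and a container that allocates one object at a time (std::list or std::map nodes, not std::vector). Freed slots are kept on a single stack threaded through the freed memory itself, so there are no per-size free lists to search.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Fixed-size slots carved from one anonymous mmap() region (hypothetical class).
class MmapPool {
    struct Slot { Slot* next; };  // overlays a freed slot, no separate bookkeeping
    std::size_t slot_size_, capacity_, bytes_;
    char* base_ = nullptr;
    std::size_t used_ = 0;   // slots handed out from the untouched tail so far
    Slot* free_ = nullptr;   // most recently released slot, reused first

    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Slot)) n = sizeof(Slot);
        return (n + a - 1) / a * a;
    }

public:
    MmapPool(std::size_t slot_size, std::size_t slot_count)
        : slot_size_(round_up(slot_size)), capacity_(slot_count),
          bytes_(slot_size_ * slot_count) {
        void* p = mmap(nullptr, bytes_, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANON, -1, 0);
        if (p == MAP_FAILED) throw std::bad_alloc();
        base_ = static_cast<char*>(p);
    }
    ~MmapPool() { munmap(base_, bytes_); }
    MmapPool(const MmapPool&) = delete;
    MmapPool& operator=(const MmapPool&) = delete;

    void* allocate() {
        if (free_) { Slot* s = free_; free_ = s->next; return s; }
        if (used_ == capacity_) throw std::bad_alloc();
        return base_ + slot_size_ * used_++;
    }
    void deallocate(void* p) noexcept { free_ = new (p) Slot{free_}; }
};

// Minimal allocator: one static pool per object type, single objects only.
template <class T>
struct PoolAllocator {
    using value_type = T;
    PoolAllocator() = default;
    template <class U> PoolAllocator(const PoolAllocator<U>&) noexcept {}

    static MmapPool& pool() {
        static MmapPool instance(sizeof(T), 1u << 16);  // arbitrary capacity
        return instance;
    }
    T* allocate(std::size_t n) {
        if (n != 1) throw std::bad_alloc();  // arrays are out of scope here
        return static_cast<T*>(pool().allocate());
    }
    void deallocate(T* p, std::size_t) noexcept { pool().deallocate(p); }
};
template <class T, class U>
bool operator==(const PoolAllocator<T>&, const PoolAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const PoolAllocator<T>&, const PoolAllocator<U>&) { return false; }

// Usage sketch: std::list allocates one node at a time, so it fits the pool.
std::size_t pool_demo() {
    std::list<int, PoolAllocator<int>> xs;
    for (int i = 0; i < 1000; ++i) xs.push_back(i);
    assert(xs.front() == 0 && xs.back() == 999);
    return xs.size();
}
```

Because the pool is a static per type, all copies of the allocator compare equal and can free each other's objects. The price is that the memory is returned to the system only at program exit.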
Writing a corresponding C++ wrapper around Cdt might also be a good idea. When I was developing the hashed text utilities (you can find them on freshmeat) I found out that Cdt is both faster and has less memory overhead than the C++ STL.

Oh well. Now I have everything working reasonably efficiently (both in time and memory) so that I can do my work. These plans I'm leaving for later. Or I'll do it while waiting, if I'm forced to switch to disk-based hashes.

It's a shame that you can't use the full 3G virtual address space in NetBSD with the standard C and C++ libraries.
