2005-07-13

On programming languages

For the last few days I've been fighting with C++ templates. I'm trying to write a mini-DSL for querying and filtering streams of data. The idea is to have provider operators: each time a provider is called, it yields a new value. Using the Boost Phoenix library (a part of the Spirit framework), I've come close to being able to write something like the following:

    until_eof(
        mapcar(
            binary_istream_provider(f),
            _1 * 2),
        cout << _1 << endl);

However.. writing custom functors is very hard. The Phoenix-2 library has.. well.. nice documentation for users, but not so nice when you are trying to extend it. And each wrong attempt results in screenfuls of gibberish from template instantiation errors. Actually, I'm starting to understand that error 'garbage', although not yet how to fix it.

Joel de Guzman, the main Spirit and Phoenix developer, is kindly trying to help me, but currently he's the only one on the mailing list answering my questions. I don't like the feeling of depending on a single person.

Some people have suggested doing it in LISP. Writing higher-order functions in LISP is trivial. But.. the data stored in the datafiles are binary C structs, suitable for direct reading into memory. Operating on such a data representation is very cumbersome in LISP.

Another language that comes to mind is Python. It is as convenient as LISP for writing higher-order functions, and dealing with binary data is relatively simple (thanks to its struct module). But it is slow. Sequentially reading ~160 million records from a BerkeleyDB RECNO database is taking forever! I'm watching the disk bandwidth usage, and it is a meager 0.5MB/sec! And the CPU consumption is 100%. BerkeleyDB can't be the bottleneck, since it is very fast with C or C++. The disk obviously has greater bandwidth. That leaves us with a single bottleneck: Python itself.

Why can't we have a nice, compiled, portable language compatible with C data types, which makes it easy to write higher-order programs? Something like D or Cyclone. Unfortunately, both fail the portability requirement (e.g. I'm running NetBSD and FreeBSD..)

Oh well.. Maybe I'll just go back to the old, type-unsafe and very powerful way of representing everything with functions taking unspecified parameters (the f() parameter list; this is not the same as f(void) in C!) and void pointers. With a few typecasts inserted here and there, everything should work fine!

Oh, BTW, that Python script is still running. More than half an hour and still not finished. I've managed to write this post while waiting.. I'll post the final time here.

1 comment:

Anonymous said...

Try ECL (http://ecls.sf.net). It's portable and has powerful straightforward integration with C. Including manipulation of C structures, of course.