- it is portable,
- can be easily viewed and edited with an ordinary text editor,
- is easier to debug.
Today I told myself: maybe you are wrong. Maybe you are once again wrongly assuming that it would be slow. Maybe you could have used text I/O and LISP. So I decided to set up an experiment, which showed that my assumption was right this time. Read on.
I have written four (two for text and two for binary I/O) trivial C programs that
- Write a file of 25 million single-precision (4-byte IEEE) floating-point random numbers between 0 and 1 (both writer programs were started with the default random seed, so they generate identical sequences). In the text file, the delimiter was a single newline character.
- Read back those numbers, calculating an average along the way.
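For illustration, the text and binary variants could look roughly like the sketch below. This is not the original code: the function names, the `"%f\n"` output format and the one-number-per-call `fwrite`/`fread` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One random float in [0, 1]; rand() is left unseeded, i.e. default seed. */
static float next_random(void)
{
    return (float)rand() / (float)RAND_MAX;
}

/* Text variant: one number per line, newline as the delimiter. */
int write_text(const char *path, size_t n)
{
    assert(path != NULL);
    FILE *f = fopen(path, "w");
    if (f == NULL) return -1;
    for (size_t i = 0; i < n; i++)
        fprintf(f, "%f\n", next_random());   /* assumed format */
    return fclose(f);
}

/* Binary variant: raw 4-byte floats in host byte order. */
int write_binary(const char *path, size_t n)
{
    assert(path != NULL);
    FILE *f = fopen(path, "wb");
    if (f == NULL) return -1;
    for (size_t i = 0; i < n; i++) {
        float x = next_random();
        if (fwrite(&x, sizeof x, 1, f) != 1) { fclose(f); return -1; }
    }
    return fclose(f);
}

/* Read the text file back; returns the average, or 0 if nothing was read. */
double average_text(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL) return 0.0;
    double sum = 0.0;
    size_t n = 0;
    float x;
    while (fscanf(f, "%f", &x) == 1) { sum += x; n++; }
    fclose(f);
    return n ? sum / (double)n : 0.0;
}

/* Read the binary file back; returns the average, or 0 if nothing was read. */
double average_binary(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL) return 0.0;
    double sum = 0.0;
    size_t n = 0;
    float x;
    while (fread(&x, sizeof x, 1, f) == 1) { sum += x; n++; }
    fclose(f);
    return n ? sum / (double)n : 0.0;
}
```

Each function would be called from its own tiny `main` with n = 25000000 and timed from the outside, for example with `time`. The text average can differ from the binary one in the last digits, because `"%f"` keeps only six decimals.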
The results are most interesting. All timings are averages of three measurements; the timings deviated noticeably only in the second decimal place:
- Writing: text file 43.7 seconds, binary file 5.2 seconds; binary I/O comes out about 8.4 times faster.
- Reading: text file 13.4 seconds, binary file 3.0 seconds; binary I/O comes out about 4.4 times faster.
- File size ratio: binary comes out 2.25 times smaller (100M vs. 225M)
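The size ratio can be checked with a bit of arithmetic. The sketch below assumes the text file was written with `"%f\n"` (the format is my guess, it is not stated above); under that assumption every number in [0, 1] takes 9 bytes ("0.", six decimals, newline), which gives exactly 225M against 100M for 4-byte floats. The helper names are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes one number occupies in the text file, assuming the "%f\n" format. */
size_t text_record_bytes(float x)
{
    int len = snprintf(NULL, 0, "%f\n", x);  /* counts without writing */
    assert(len > 0);
    return (size_t)len;
}

/* Total text file size in bytes for n numbers that all print like x. */
unsigned long text_file_bytes(unsigned long n, float x)
{
    return n * (unsigned long)text_record_bytes(x);
}

/* Total binary file size in bytes for n 4-byte floats. */
unsigned long binary_file_bytes(unsigned long n)
{
    return n * 4UL;
}
```

For 25 million numbers this gives 225000000 bytes of text against 100000000 bytes of binary, matching the 2.25 ratio above.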
Overall, I think that "use text I/O" cannot be given as a general recommendation. I certainly don't want to use it, because I have files with gigabytes of binary data. In my case text I/O would be even slower, because parsing input records with more than one field is more complicated.
As I see it, the only thing in favor of text I/O is its portability. The other two arguments at the beginning of this text are just a consequence of the lack of adequate tools for viewing and editing binary files. There are plenty of hex editors, but none of them will interpret the data for you (i.e., you can't tell it "display these 4 bytes as an IEEE floating-point number") or let you define your own structures composed of different data types.
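To show what "display these 4 bytes as an IEEE floating-point number" would amount to, here is a minimal sketch of such a viewer in C. It assumes the file was written on a machine with the same byte order and a 4-byte `float`; `float_from_bytes` and `dump_floats` are names invented for this example, not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret 4 raw bytes (host byte order) as a float. */
float float_from_bytes(const unsigned char *raw)
{
    float x;
    assert(sizeof x == 4);          /* assumption: 4-byte IEEE float */
    memcpy(&x, raw, sizeof x);      /* memcpy avoids aliasing problems */
    return x;
}

/* Print up to count floats starting at byte offset: file position,
   the four raw bytes in hex, then the interpreted value.
   Returns how many values were actually shown. */
size_t dump_floats(FILE *in, long offset, size_t count, FILE *out)
{
    unsigned char raw[4];
    size_t shown = 0;

    if (fseek(in, offset, SEEK_SET) != 0)
        return 0;
    while (shown < count && fread(raw, 1, sizeof raw, in) == sizeof raw) {
        unsigned long pos = (unsigned long)offset
                          + (unsigned long)(shown * sizeof raw);
        fprintf(out, "%08lx  %02x %02x %02x %02x  %g\n",
                pos, raw[0], raw[1], raw[2], raw[3],
                (double)float_from_bytes(raw));
        shown++;
    }
    return shown;
}
```

This only covers a flat array of floats. Records made of several different types would need a description of the layout, which is exactly the part I am missing in the existing editors.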
There is GNU Data Workshop. I can't comment on it: it is written in Java, so I do not want to (i.e., on NetBSD, can't) use it. Not to mention that it is a GUI program and I do all my programming on the server via ssh.
After some searching, I've come across a console-based hex editor that offers some of the capabilities I need: bed. However, its user interface is unintuitive, even though it is menu-driven.
If you know of a good, non-Java, console-based, flexible binary editor, please drop me a note. Thanks :)