The Core Dump of Thought: 2006

2006-12-24

AMD64 TLB invalidation performance

The AMD64 optimization manual specifies that the latency of the INVLPG instruction is 101 cycles in 32-bit mode and 80 cycles in 64-bit mode. Considering that the TLB is so closely tied to the MMU and the CPU, and that no accesses to RAM are needed, I'm wondering why it is so slow. Even more interesting is the large (about 20%) difference between 32- and 64-bit mode.

How fast/slow is it on Intel CPUs? No idea. Their optimization manual gives instruction latencies only for a relatively small subset of instructions. INVLPG is not among them.

2006-12-23

Ashgabat

After hearing the news about the death of Turkmenistan's president, I searched the net to find some pictures of the capital, Ashgabat. I found this site and all I could think of after seeing the pictures was "Wow. I want to go there." (despite the fact that summer temperatures can rise up to 45 degrees Celsius). Colourful and a bit surreal. A jewel found in an ex-Soviet republic. Who would have thought?

Peter Gutmann on Vista content protection

His article is a very interesting read. A quote: "The Vista Content Protection specification could very well constitute the longest suicide note in history." The scary part is about automatic picture degradation while listening to music and its potentially disastrous consequences in medical applications, where a crystal-clear picture is needed for a correct diagnosis.

2006-12-13

How to not debug programs

I was drawn to this article, titled "Signals as a Linux debugging tool", by a recent link on the OSNews site. The subtitle reads "Intelligent signal handling finds bugs faster". Two things are wrong with this article.

The first thing is that the author's code examples call the printf function in a signal handler to output register values when a fault happens. This is undefined behaviour, as the printf function is not listed as async-signal-safe in the POSIX standard. Ironically, an article titled "Use reentrant functions for safer signal handling" is listed among the references.

The second thing is that the author suggests an unbelievably complicated way of finding bugs. Once your signal handler with undefined behaviour has dumped the values of registers to your terminal, you are supposed to use objdump to disassemble your program executable, find the faulting location, and somehow map it to your program source.

I wonder whether the author actually knows how to make a program dump core when it faults, what to do with the core file, and how to use a debugger, such as gdb. (Hint: debuggers are much more powerful and convenient to use than what the author suggests in the article.) It's surprising that such a misleading, low-quality article can show up on an IBM web site.

2006-12-06

Nokia 770

Recently I had the opportunity to get a Nokia 770 Internet Tablet (basically for free - a loan for an indefinite time). I declined. Previously I have owned both a PalmOS device (Visor) and an iPaq onto which I installed Linux. In both cases I couldn't think of anything useful I could use the device for, so it was just a waste of time.

This time I did have an idea - as a dictionary where I can note down new words, and as a note-taker to replace my paper notebook. So I tried to play with it and it has the same problem as other PDAs - text entry. The on-screen keyboard you tap on is next to useless for real-time text entry. The handwriting recognition isn't much better. If you want high accuracy, you need to carefully and slowly draw letter by letter. I wonder why Nokia didn't license Palm's Graffiti entry system. (Yes, there is the option of an external Bluetooth keyboard. But that is too cumbersome - suddenly you have to fumble around with two small pieces of fragile equipment that you don't want to drop.)

An ideal PDA should come with software capable of learning natural handwriting. Here's how I imagine it could work. First you type in (using a keyboard) a relatively large text. Then you write the same text (using your natural handwriting) on the PDA and let it figure out which groups of letters the composite pen strokes stand for.

This learning process would initially take more time, but it would be worth it in the long run. I don't think that today you can find a PDA which can recognize natural handwriting in real time. Until such a time comes, I think I will stick with pen and paper for taking notes in real time.

2006-11-27

C, C++ and width of integer types

A perpetual problem with C and C++ programming is the very loose specification of integer types (e.g. int must be at least 16 bits, with no upper limit; long has to be at least 32 bits and not smaller than int). C99 solved this problem by introducing the <stdint.h> header.
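To see how loose the specification is in practice, here is a minimal sketch that asks the local platform how wide its C integer types are. It uses Python's ctypes module as a window onto the platform's C ABI; the helper name width_bits is made up for this illustration. The printed widths of the plain types depend on the machine, while the fixed-width types (the ctypes counterparts of int16_t, int32_t, int64_t) do not.

```python
import ctypes


def width_bits(ctype):
    """Width of a C type on this platform, in bits."""
    return 8 * ctypes.sizeof(ctype)


# Plain C types: only minimum widths are guaranteed by the standard.
for name in ("c_short", "c_int", "c_long", "c_longlong"):
    print(f"{name:12s} {width_bits(getattr(ctypes, name))} bits")

# Fixed-width types: the same on every platform that provides them.
for name in ("c_int16", "c_int32", "c_int64"):
    print(f"{name:12s} {width_bits(getattr(ctypes, name))} bits")
```

On typical 64-bit Linux and 64-bit Windows builds the c_long line differs, which is exactly the kind of surprise the fixed-width types avoid.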

The problem could have been solved by redefining the register keyword to mean:

I wonder how many programs would break, esp. if the register keyword were a no-op whenever its use would violate the minimum requirements for integer types (short, int: at least 16 bits; long: at least 32 bits).

I don't know how useful this redefinition would be. After all, C99 does include "fast" integer types. Hm. Any opinions?

2006-11-25

Spammers have gotten smarter

Today I've actually read some text that got through the spam filter. Total nonsense that kinda makes sense. Here's an excerpt:

Popularity of blogs helped also to popularize concept web content mechanisms is such as or Atom have. Xml am perform operations instead using or Feeditem Item want or cannot changed exception readunread property in attached. System time or downloaded via Http Https parsed is normalized unified Identifies updated is Merges reflect last. Those tricky a details platform shields even of supports upcoming Support or Whether is implement innovative scenarios basically deal with Common Feed List. [etc]

This reminded me of computer-generated text, so I searched for Markov chain text generators. Here's one for example. Study its output (links are near the bottom of the page) and it'll be the same kind of "nonsense making sense".

Bayesian filtering is a kind of "inverse" of Markov chain text generation - both methods are based on statistics. The problem with the Markov-generated text is that its statistical properties closely match those of real text, so the Bayesian filter doesn't classify them as spam.
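A minimal sketch of the filtering side, assuming a naive Bayes classifier over single words with Laplace smoothing. The training messages, labels and function names are invented for illustration; a real filter is trained on thousands of messages and uses a better tokenizer.

```python
import math
from collections import Counter


def train(messages):
    """Count word occurrences per label from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in messages:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals


def spam_log_odds(text, counts, totals):
    """Log-odds that `text` is spam; a positive value means 'more likely spam'."""
    vocab = len(set(counts["spam"]) | set(counts["ham"]))
    n_spam = sum(counts["spam"].values())
    n_ham = sum(counts["ham"].values())
    score = math.log(totals["spam"] / totals["ham"])  # prior odds
    for word in text.lower().split():
        # Laplace smoothing keeps unseen words from producing log(0).
        p_spam = (counts["spam"][word] + 1) / (n_spam + vocab)
        p_ham = (counts["ham"][word] + 1) / (n_ham + vocab)
        score += math.log(p_spam / p_ham)
    return score


# Toy training set (hypothetical).
counts, totals = train([
    ("buy cheap pills now", "spam"),
    ("cheap pills cheap offer", "spam"),
    ("feed parser updated today", "ham"),
    ("system time parsed via http", "ham"),
])

print(spam_log_odds("cheap pills", counts, totals))           # positive
print(spam_log_odds("feed parsed via http", counts, totals))  # negative
```

The second message is nonsense, but it is built from words the filter has only seen in legitimate mail, so it scores as legitimate. That is the weakness the generated spam exploits.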

Generating garbage with required statistical properties is relatively easy; it just requires a list of words and a good Markov model. Once generated, it requires real human understanding for classification.
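To show how little is needed, here is a minimal sketch of a first-order, word-level Markov chain generator. The corpus and the function names are made up for illustration; the generators found on the net typically use longer contexts and much larger corpora.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng):
    """Random walk over the chain, beginning at `start`."""
    out = [start]
    while len(out) < length:
        followers = chain.get(out[-1])
        if not followers:  # dead end: the word never had a successor
            break
        out.append(rng.choice(followers))
    return " ".join(out)


corpus = ("the feed is parsed and the feed is merged and "
          "the item is updated and the list is normalized")
chain = build_chain(corpus)
sample = generate(chain, "the", 12, random.Random(0))
print(sample)
```

Every pair of adjacent words in the output also occurs in the corpus, which is why the word statistics of the result look so much like real text.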

I didn't study the theory behind Markov processes and Bayesian filtering deeply. I might be talking half-rubbish :) But given the amount and kind of spam that gets through the filter, I have a feeling that spammers are slowly winning the battle.

2006-11-20

10 Immutable Laws of Security

This is a nice essay.

New C++

This article describes the new features of the upcoming C++09 standard. (Read through it, there are other good links hidden inside). My favorite addition is automatic type deduction (unfortunately, not described in the article).

2006-11-15

NILFS for Linux

Today I've seen a reference to NILFS, a log-structured filesystem for Linux. It's interesting how they've put the most important "feature" at the bottom of the "Current status" page. Looking at the page, the first important thing one notices is that the work on the garbage collector is ongoing. Doesn't sound good. At the bottom of the page, they conclude under known bugs with "The system hangs on a disk full condition." How nice :)

On a more serious note, I think that it's great that someone is working on alternative filesystems. NILFS can, for example, support "time travel".

2006-11-08

XML sucks (again!)

This isn't yet another of my anti-XML rants. There actually exists a web site with such a name: xmlsucks.org.

2006-11-06

I've registered myself with LinkedIn. It is a service to help maintain professional contacts. Doesn't cost anything and might be useful in the future.

2006-11-02

Makefile madness

Whenever I start a new project, there is one single thing I hate the most: maintaining a makefile. A (relatively) long time ago I found the article "Recursive Make Considered Harmful". It is both a critique of recursive make and a guide on how to write a good makefile.

I bit the bullet, applied the recipes given there (it didn't even take much time), and so far so good: it automatically discovers newly added files, and dependencies maintain themselves. Without resorting to auto* madness :)

2006-10-30

Software vs. hardware virtualization

This article [vmware.com] compares performance of software and hardware virtualization techniques. What is most surprising is that the results are mixed: hardware-assisted virtualization can actually be slower than pure software virtualization under certain workloads. Wow!

The benchmark was done with a VT-enabled Pentium 4. It would be interesting to see how AMD's Pacifica hardware virtualization compares.

2006-10-29

A small laugh

Being a Croatian citizen, I found this news article funny: according to the "Worldwide Press Freedom Index", the USA ended up in 53rd place, together with Croatia, Botswana and Tonga.

What's even more surprising is that some former communist countries, which only recently abandoned communism, are extremely high on the list - e.g. the Czech Republic is in 5th place.

2006-10-28

Anti-virus, virtualization and security paradigm

This is a very interesting interview with Joanna Rutkowska, the author of the "Blue Pill" rootkit. She just confirmed an opinion I have had for a long time: that AV programs are mostly useless (heck, she doesn't even run one on her WinXP 64-bit machine).

AV detection is an inherently undecidable problem; therefore it will always be possible to create an undetectable virus. Without needing a rootkit that puts the OS into a VM.

Her wish (quote):

"The solution that I would love to have would be based on integrity checking of all the system components, starting from filesystem (digitally signed files), through verifying that all code sections in memory haven't been modified (something I partly implemented in my SVV scanner) and finally checking all the possible "dynamic hooking places" in kernel data sections."

is not realistic (unless the scanner is in the hypervisor) because of the question: How does the scanner ensure its own integrity?

What I would like to see is a paradigm shift in the security industry. It should put more weight on prevention and damage containment rather than source code auditing and scanning of programs/memory. Both techniques have been in use for a very long time and they don't work very well.

My view is that the OS should use virtualization technology to create extremely light-weight, isolated environments; in the extreme case 1 VM per running application (this requires some heavy engineering to be doable efficiently - e.g. sharing of the core OS code between VM instances). Each VM would expose only those parts of OS functionality that are absolutely necessary for the application to work. Information flow between VMs would be strictly under user control (thus making the user once more the weakest link in the chain).

There lie some heavy research questions in my proposal:

Efficient memory utilization (it would be infeasible to completely copy all of the underlying OS into each VM). Hypervisor would have to be intimately tied to the "guest" OS.
Policies for information flow between VMs.
Efficient history saving (so that user can roll back to some previous VM state).
Interoperability with other VM products like Xen or VMware.

Regarding the last point, there is an interesting comment in the AMD64 Pacifica manual for the VMRUN instruction under "Instruction intercepts":
"Note: The current implementation requires that the VMRUN intercept always be set in the VMCB."
Is this a hint that, in the future, we might get HW support for recursive virtual machines?

2006-10-18

Object-orientation

I have kind of despised object-oriented programming for a long time now. (This was a result of bad experiences on large projects. It actually made things worse rather than better.) Until I found this section in Xavier Amatriain's PhD thesis. It is a nice view on the matter from Kristen Nygaard, one of the "fathers" of object-oriented programming (the other was Ole-Johan Dahl). His view I actually like.

After reading that section in the thesis and browsing through Nygaard's and Dahl's homepages, I felt a bit sad for not getting to meet them in person. Now I have respect for OO as it was envisioned and find it a worthy idea. Misused in practice most of the time.

Tags: programming

2006-10-17

Interfaces and stability

These days there is much fuss around (not so) newly discovered bugs in nVidia drivers for Linux. Instead of being happy that a large software vendor has gone to the trouble of providing drivers for an insignificant portion of their market, users are whining about the "evil" nature of closed-source binary blobs being loaded into the kernel.

The importance of the bug is exaggerated. I consider it bad practice to install any kind of advanced graphics capabilities on servers. As for desktops... well, a plethora of bugs in other "desktop programs" already exists, and this one doesn't pose any additional threats beyond the existing ones. And it's simple to fix - don't use the driver.

I've found many complaints that nVidia's drivers are low-quality, unstable or just don't work. (Even today a friend complained to me.)

What is most fascinating is that users of these drivers are barking up the wrong tree (nVidia in this case): the real fault lies in the lack of official kernel APIs, and, to make the situation more difficult, the existing interfaces are ever-changing. And Linus is even proud of it, replying along the lines of "read the source".

IMHO, users are in this case a direct victim of such an attitude. As I said, Linux is only a secondary platform for nVidia. They have no real financial incentive to keep up with Linux kernel development. There is no point in constantly chasing a myriad of Linux kernels with different patches and trying to make drivers work with every single one of them. Why? Because there's no stable kernel API.

Binary-only drivers (if written well, but that's beside the point here) work very well on Solaris, AIX and Win32. I don't know about AIX, but I know that Solaris and Win32 publish official driver development kits (DDK). Every 3rd party manufacturer can write a driver w/o relying on the "current state of flux" of the kernel and be reasonably certain that their investment in the platform is long-term. Something which is not the case with Linux.

I encourage users to stop buying the "binary blobs are evil" nonsense and start asking Linux developers the following question: "Why doesn't Linux have a DDK?" If some DDK appears, Linux will maybe (just maybe) become a more attractive platform for hardware manufacturers. I believe it'd be easier to convince "big players" to write Linux drivers conforming to a DDK than to convince them to publish HW specs. Until such a time, users are "doomed" to reverse-engineered drivers, "black magic" (like ndiswrapper), buggy drivers (like nVidia's), or simply no drivers at all.

And I fully understand the reluctance of ATI and nVidia to open up specs. Opening up the HW spec can reveal much about the internal implementation. And internals are what they are living off. Keeping them closed encourages competition. And in the end, it's the users who benefit from it. (Just imagine ATI copying every feature of nVidia with the same performance and a comparable price, and vice-versa. They would simply lose any incentive to further develop their chips. At least until a newcomer to the market appears.)

Tags: linux nvidia drivers

2006-10-07

Another critique of "free" software zealots

This article announces a completely "free" browser named IceWeasel, derived from the Firefox code. This article points out some problems that distributions like Ubuntu and Debian have while distributing Firefox, and why Firefox might not actually be "free".

In my opinion, they present a skewed view of the matters, unfairly picturing the Mozilla corporation as a "bad guy". Quote from the 2nd article: "Though Debian and Debian-derived distributions such as the popular Ubuntu Linux currently include Mozilla Firefox, they do not typically include the actual Mozilla Firefox logo."

The question to ask is: WHY do they strip Firefox of its logo? What do they put in its place? Logos have extreme importance in today's world, and can actually be said to sell products (look for example at Nike). What kind of ethics drives these "free" software developers? It seems that not only do they want to own the code, they also want to own the corporate identity. And be able to remove it from programs at will, possibly replacing it with their own. That's using the hard work of another company to promote themselves. If this is the kind of ethics that RMS and the FSF stand for, their effort is better renamed to "slave" software.

Tags: fsf gnu mozilla firefox

2006-10-01

The terrorists have won

Here you can see a small video showing violent reactions of very small quantities (as small as 2 grams) of alkali metals with water. Imagine sneaking 2 grams of lithium onto a plane, buying a bottle of water on the plane and dropping the lithium inside it. KABOOM! Or better yet, drop it into the toilet. Even if it doesn't crash the plane, it'll make an unforgettable experience for the passengers. Are metal detectors sensitive enough to detect 2 grams of any metal? Can the security officer examining your hand-baggage through x-ray notice any object weighing 2 grams?

And the new EU airline security regulations, which will take effect from 1.11., forbid carrying more than 1 deciliter of one's own liquids onto the plane. Who are they trying to protect and from whom? Just to make it clear, I have no intention of blowing up planes or killing people. This post is a form of protest, and a way to point out the worthlessness of most of these security measures. Esp. forbidding liquids. If terrorists want to mass-kill people, they can do it almost undisturbed in check-in waiting lines.

As a side note, people die. Nobody lives forever. More people die of cancer than have died in terrorist attacks since 9/11. Yet much more money is spent on the "war on terror" (it'd be better renamed to "war generating terror") and fear-propaganda about the dangers of terrorism than on people's health. I wonder how many people would stop smoking and eating junk food, and begin living healthier in general, if that much money were invested in anti-smoking and other health campaigns.

IMO, the way to fight terrorism is not to take away freedom from people and give it to the government (exactly what is happening now in e.g. the US) and corporations (e.g. airline security bodies). The word "terror" comes from the Latin language, and its original meaning is fear, fright. Given this meaning, and considering how many people are afraid and frightened, I think it's fairly OK to say that terrorists have won. Not only are the people afraid, certain governments seem to be pushing their citizens into dictatorship. Slowly, but surely. (Just look at the new "torture law" in the US.) Exactly the thing they claim to be fighting against.

So how should we fight terrorism? First, stop being afraid. (That might not be in the interest of certain presidents, as the fear they themselves have generated by their propaganda is the only thing keeping them in power). Second, we as a society should adapt. As the human immune system adapts to bacteria and viruses, the society should adapt to terrorism. As with diseases, there will always be random casualties. But random casualties are already all around us (home-accidents, car-accidents, drug overdose, medical mistreatment..); why do we have to single-out terror-accidents (I purposefully use the word accident here!) and make a fuss about them?

I don't have a recipe for the "adapt" part. People do not want to be killed by terrorists. People do not want to live in fear. People do not want war. I believe that people will cooperate on their own with police to prevent bad things from happening, only if given a chance. But as long as they are afraid, they won't dare take that chance even if given.

Good examples of the latter reasoning are arrests for attempted attacks in London and Denmark. That's commendable. But stricter security regulations are not justified. It's like fighting diseases by forbidding bacteria. It doesn't work.

Tags: airline security terrorism

2006-09-28

How do committees invent?

This article is still a fascinating read, although it dates back to 1968. It analyzes some fundamental aspects of large systems development, and has some nice insights. Although I didn't read the book "The Mythical Man-Month", reading the article reminded me of the title.

Tags: business management organization

2006-09-26

In defense of Tcl

I found a link to this article on reddit, and following the discussion, a link to this article. If you're short on time, I recommend that you read the 2nd one.

Tags: tcl programming

2006-09-24

Category theory

I have just finished reading the book "Conceptual Mathematics: A First Introduction to Categories". It is one of the best mathematics books I've read. Not only because the theory presented is very interesting (indeed, I've learned the deeper meaning of some things that I've taken for granted until now), but also for its exceptional presentation style: concepts are explained through many illustrated examples in an accessible way, without delving into deep abstractions.

If you want to read an accessible introduction to category theory, I can heartily recommend this book.

Tags: mathematics books

2006-09-13

How the questions shape the answers

Many things today are advertised based on evidence obtained by polls. This article [PDF], although a bit old (dating to 1999), is a fascinating read. Basically, polls can end up with dramatically different results, depending on the way the questions are asked. I'm wondering whether marketers use the tricks described in the article when constructing polls that should come out in their favor.

Tags: psychology marketing

2006-09-09

A critical view on upstart

There has been much fuss lately about upstart, which is supposed to be a single replacement for several daemons: SysV-init, cron, inetd, hotplug... This article is a commercial trying to sell upstart, but somehow it hasn't convinced me.

The first reason I'm not comfortable with the idea is that UNIX is built on the philosophy "one tool for one job". Every tool should do one job and do it well. Merging several different tasks into one program just feels "yucky". It feels "windows-way".

The second reason is security and stability. Take for example cron. Even though it has a seemingly simple task, a very popular implementation, vixie-cron, had some security bugs in the past. Now it's going to be reimplemented again. Not to mention that upstart then becomes a single point of failure. Imagine e.g. remotely induced reboot or kernel panic by triggering some bug in upstart's networking code and making it crash. (And since it's running instead of init, it'll bring the whole system down).

The rest of this post is a dissection of the article cited above.

The first part of the article is what I call "Problem setting." In trying to explain why SysV init doesn't work today, the author says "The simple answer is that our computer has become far more flexible." and enumerates certain situations which do not really pose a problem. Most of them are related to hotplugged hardware, which is already handled (I see it working nicely on RH and SuSE). He concludes with "We've been able to hack the existing system to make much of this possible, however the result is chock-full of race conditions and bugs." While I admit that there may be some problems, saying "chock-full" is a blatant exaggeration.

Question 1: Why replace everything instead of sticking with the UNIX philosophy and making the current system better?

The second part is "Design". On the surface it seems sane and well-designed, but take a look at the example list of events; the most striking one for me is "the root filesystem is now writable". He doesn't say who is supposed to generate these events! This is a shift of responsibility from getting the startup script ordering right to generating the right events at the right time. Currently we have a small, well-controlled set of dedicated processes, and the upstart system seems to lead towards an explosion of possibilities along at least two dimensions: kinds of events and when they are generated.

Question 2: Who is generating events? Who is writing event handlers? If the event handling system is extensible, how is the system integrity guaranteed (so that a faulty handler doesn't bring the whole upstart process down)? What happens when an event isn't handled because a handler is missing? Is it an error, how is it reported and to whom, is it simply ignored..?

The third part is "showing off" or FUD-ing. Showing existing tools in a bad light in order to sell "upstart" better. This is the funniest part! Namely, the author doesn't seem to find good arguments against initng, a dependency-based system, so he resorts to ridiculous argumentation: "However this means that you need to have goals in mind when you boot the system, you need to have decided that you want gdm to be started in order for it, and its dependencies, to be started.", continuing with "[..initng] It can reorder a fixed set of jobs, but cannot dynamically determine the set of jobs needed for that particular boot." and finishing with "initng starts with a list of goals and works out how to get there, upstart starts with nothing and finds out where it gets to."

Question 3: How is the computer supposed to figure out, even before it is turned on, what the user has in mind and what should be the target configuration? How could it know that the user on a particular boot wants e.g. xdm to boot, without any user input (e.g. without being given a goal)?

upstart seems like a solution to an invented (or, to say the least, exaggerated) problem. I hope the author does a better job of coding than of arguing for its usefulness.

[From personal experience, a dependency-based system is used on FreeBSD, NetBSD, and Gentoo Linux. It's very easy to maintain, and I like it better than the SysV-init style boot process.]

[Another note: One should distinguish between the init program and the SysV-init style boot scripts. It is possible to use the (SysV-)init program with a dependency-based system. And that's exactly what Gentoo does.]

Tags: linux upstart ubuntu

2006-09-06

Deliberate bad engineering

Today I discussed a simple problem with a colleague. He wants to design a simple format for representing graphs (nodes and arcs). He said that it's probably going to be XML-based to which I replied that it is a very bad engineering decision (see below for short explanation why and other choices). He agreed to that, but he's going with XML anyway. He said that today's IT industry is full of people "falling" for 3-letter acronyms and that he just wants them off his back. So, from the engineering viewpoint, the better solution has lost because of XML's "psychological effect". I imagine something like "It uses XML, therefore it must be good." Bullshit.

The problem with XML is that it's not very human-friendly and it's complicated to parse. Yes, you have ready-made parsers, but you still have to walk the parsed tree. I suggested embedding an interpreted language such as Lua or Tcl. The syntax is definitely more readable than XML, the parser is there, the user gets additional power (e.g. programmatically constructing the graph instead of tediously enumerating nodes and arcs), and there is no "walking the tree". "Tags" in the scripting language can be bound to C functions and made directly executable, thus constructing the internal graph representation as the graph description is read, w/o subsequent walking of the tree.
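Here is a minimal sketch of the idea, with Python standing in for both the embedded language (Lua or Tcl in my suggestion) and the host (C). The names load_graph, node and arc are invented for this illustration. The graph description is an ordinary script; its "tags" are functions supplied by the host, so the internal representation is built while the description executes.

```python
def load_graph(description):
    """Run a graph description script and return (nodes, arcs)."""
    nodes = set()
    arcs = []

    def node(name):
        nodes.add(name)

    def arc(src, dst):
        nodes.update((src, dst))
        arcs.append((src, dst))

    # The script sees only these names. This narrows the vocabulary;
    # it is NOT a security sandbox for untrusted input.
    env = {"__builtins__": {}, "node": node, "arc": arc, "range": range}
    exec(description, env)
    return nodes, arcs


description = """
node("start")
for i in range(3):            # build a chain programmatically
    arc(f"n{i}", f"n{i + 1}")
arc("start", "n0")
"""

nodes, arcs = load_graph(description)
print(sorted(nodes))
print(arcs)
```

There is no parse tree to walk afterwards: by the time exec returns, the graph already exists.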

The better solution is obvious, he agrees that it's better, but he's still not going to use it because of the "buzzword effect". How many projects have ended up taking the buzzword route, sometimes to their own detriment? (I've myself participated in one such project. I suggested otherwise, but the management decided to go the "3-letter way", and the project was more than half a year late.)

Tags: software engineering programming xml

2006-09-02

An idea for gmail

How about revoking emails? Namely, the user sending something to @gmail.com could also send a special "mail revoke" message to revoke his sent mail. If the sender is also a gmail user, he could be offered a simple GUI option. To eliminate potential privacy concerns, the feedback would be either "revoke received" or "trying to revoke invalid message". Any other kind of message would let the sender know whether his mail was already read or not.

What would be the safe contents of such message? Something like SHA1(sender_address || SHA1(mail_body)) would be sufficient, although crude in the first iteration.
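A minimal sketch of that formula, reading "||" as byte concatenation. The function name, the UTF-8 encoding of the address and the use of the raw (not hex) inner digest are my assumptions; the formula itself doesn't fix them.

```python
import hashlib


def revoke_token(sender_address, mail_body):
    """SHA1(sender_address || SHA1(mail_body)) as a hex string.

    `mail_body` is the message body as bytes.
    """
    body_digest = hashlib.sha1(mail_body).digest()
    return hashlib.sha1(sender_address.encode("utf-8") + body_digest).hexdigest()


token = revoke_token("alice@example.com", b"Please ignore my previous mail.")
print(token)
```

The receiving side can compute the same value for every stored message from that sender and compare, without the token revealing anything about the body.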

More importantly, mechanisms to securely "revoke" only one's own messages are available, so why isn't such an option already implemented?

Tags: google gmail

2006-08-31

Rant and new stuff

It's been a long time since the last post. In the meantime I have been on vacation for 10 days, and at least 1.5 weeks w/o a decent net connection.

I have continued to work on PKCS#11 support for GnuPG. I just have to say that I don't like how the code looks. On the one hand, a set of mutually cooperating processes seems like a good idea. On the other, it's almost impossible to debug when writing one's own extensions, and the lack of a documented error-handling protocol doesn't help either.

On the up-side, I have written a small C tutorial oriented towards OS development. It is aimed at students enrolled in the OS course taught at the University of Oslo and the University of Tromsø. You can find it here.

Tags: gpg gnupg programming c

2006-08-07

Prime sieves

Here I have described a relatively simple way to cut down the space requirements for computing the Sieve of Eratosthenes.
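As a point of comparison only, here is a minimal sketch of one well-known space-saving trick: storing flags for odd numbers only, which halves the memory of the textbook sieve. This is not necessarily the method described in the linked write-up, and the function name is made up.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes that keeps flags for odd numbers only."""
    if n < 2:
        return []
    # flags[i] stands for the odd number 2*i + 1; index 0 (the number 1) is cleared.
    size = (n + 1) // 2
    flags = bytearray([1]) * size
    flags[0] = 0
    i = 1
    while (2 * i + 1) ** 2 <= n:
        if flags[i]:
            p = 2 * i + 1
            # Start at p*p; a step of p in index space is a step of 2p in value space,
            # so even multiples are never touched.
            start = (p * p) // 2
            flags[start::p] = bytearray(len(range(start, size, p)))
        i += 1
    return [2] + [2 * k + 1 for k in range(1, size) if flags[k]]


print(primes_up_to(50))
```

Using a bit per flag instead of a byte, or sieving in fixed-size segments, would cut the space further at the cost of more bookkeeping.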

Tags: mathematics programming algorithms primes

2006-08-05

The plague of mailing lists

Open source is plagued by mailing lists. If you want to report a bug, how do you do it? Well, in most cases it boils down to sending a mail to the project's mailing list. It gets worse: you have to subscribe before you're allowed to post anything. Subscribing/unsubscribing to a mailing list just to report a bug is a major hassle. Recently I've reported bugs for two different projects[1] directly to the authors, and haven't received any reply. Maybe they received it, maybe it got eaten by their spam filtering, maybe they just don't have time. Bottom line: I don't care. If they want to receive bug reports from their users, they'd better find some more convenient way. For example, take a look at how gentoo handles it. Reporting a bug to the gentoo project was very easy and painless, and I didn't experience it as a hassle at all.

[1] One of the projects is Ingo Molnar's linux realtime preemption patch.

Tags: open source linux mail gentoo

2006-08-02

Switching to emacs

I've pulled emacs 22 from CVS, compiled it and switched to it from xemacs. The main reason is that I want to use org-mode (included with emacs 22) to keep a collection of notes. Although it is supposed to work with xemacs as well, I haven't been able to byte-compile it; it fails with an obscure error message. I did report a bug to the author, and I'm waiting for an answer now. I have a feeling that xemacs is now lagging behind emacs development, and it definitely has less community support on freenode irc. Moreover, it seems that most packages are first developed for emacs, and only then (maybe) ported to xemacs.

Tags: emacs xemacs

2006-07-23

C++ std::vector problems

Inspired by a usenet discussion, I have described some problems in the current C++ definition and implementation of allocators and vectors. You can read more about it here.

Tags: C++ STL programming linux

2006-07-20

Few links

A few days ago I stumbled upon this very nice and high-quality blog about math. This is another useful link with the FXT book and algorithm library. I'm amazed what some people generously give out for free.

Tags: mathematics programming books

2006-07-17

Another GPL rant

You might already know that I'm less than a fan of GPL. Here is another view on the issue.

Tags: GNU GPL FSF

2006-07-02

New stuff...

Soon I'm traveling to Portugal to attend the ICDCS 2006 conference. Related to this, I have put some new stuff on core-dump, my web site.

Now I'm a bit in between.. Maintaining a blog at blogger, and putting "more serious" stuff at core-dump. I've been thinking to continue writing the blog at core-dump, but somehow it doesn't feel "right". I like to keep the distinction between "serious" and "not serious" stuff. Any comments on that?

Tags: rumblings

2006-06-29

Dangerous Javascript

This article, titled "Knowing the User's Every Move...", is worrying. From the abstract: "In this paper, we investigate how detailed tracking of user interaction can be monitored using standard web technologies." In short, they have developed some JavaScript code (which runs in Netscape, Konqueror/Safari, IE and Opera) as well as a proxy which transparently injects that code into the page HTML before it is delivered to the client. This code enables detailed tracking of users' actions including mouse movements, clicks and key presses.

This is particularly worrisome, as this mechanism can very easily be abused. Moreover, the current controls in, for example, Opera 9 are very inadequate. If I disable JavaScript, then I can't use advanced AJAX applications, such as Gmail. On the other hand, there is no possibility to have JavaScript enabled only for "trusted sites" stored in some list and administered by the user.

Tags: privacy javascript security browsers

2006-06-25

The (not always so) powerful valgrind

I don't think there's a respectable C programmer that hasn't heard about the valgrind tool for checking (among other things) memory access violations in a program. In a program that I'm writing, I was hitting an assertion failure where I shouldn't have. Something led my program into an inconsistent state, and I couldn't figure out what. It appeared seemingly at random, which is usually a manifestation of some memory management problem. So I ran the program through valgrind, and got no errors (apart from those reported for the gethostbyname() function). With the help of hardware breakpoints in GDB, I tracked down the problem to the following piece of code (roughly):


struct smth {
  int state;
  ...
  char buf[MAXBUF];
};
static struct smth a[16384];
...
struct smth *p;
...
p->buf[i] = 0;

At certain points in the code, the i variable was equal to MAXBUF, so it overwrote the state member of the next structure in the array. This is still within the bounds of the array, so valgrind didn't complain although it is a serious programming error.

I'm coding a user-level thread scheduler and using the makecontext() family of functions. This doesn't help either: the debugger gets very confused when trying to trace through such a program. Apparently, it can't single-step over swapcontext() boundaries. So I had to put a hardware breakpoint on data change (for the state member) with the additional condition that state is set to 0. I fixed the code by changing it to


p->buf[MAXBUF-1] = 0;

(in this case, this is correct, although not strictly equivalent to what was previously there).

Lesson: use assertions abundantly. Whenever you get an assertion failure, it's an indication that you have a wrong idea about your program's behaviour. Better to find that out sooner than later. And don't think that your program is error-free just because valgrind says so.
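
To make the overflow above concrete, here is a minimal sketch. The value of MAXBUF, the four-element array and both function names are assumptions made up for illustration, and the elided struct members are left out. It shows that the byte just past buf in one element is the first byte of the next element (on a typical ABI with no padding between the two), and how a simple bounds check would have refused the bad index.

```c
#include <stddef.h>

#define MAXBUF 64                 /* hypothetical size; the post gives none */

struct smth {
    int  state;
    char buf[MAXBUF];
};

static struct smth a[4];          /* small stand-in for the 16384-element array */

/* Returns 1 if the byte one past the end of a[0].buf is where a[1].state
   starts, which is why the stray write stays inside the array. */
int overflow_lands_on_next_state(void)
{
    return (char *)&a[0].buf[0] + MAXBUF == (char *)&a[1].state;
}

/* Bounds-checked store: rejects i == MAXBUF instead of writing into the
   neighbouring element. Returns 0 on success, -1 on a bad index. */
int store_terminator(struct smth *p, size_t i)
{
    if (i >= MAXBUF)
        return -1;
    p->buf[i] = 0;
    return 0;
}
```

In real code the check inside store_terminator() could just as well be an assert(i < MAXBUF), which fails at the point of the bad write rather than much later.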

Tags: valgrind c programming debugging

2006-06-22

Intel's (foul) marketing

This page tries to show the superiority of Intel's latest processors over Opteron. Of course, the largest bar (= the best result) represents Intel's processor. The important fine-print about configuration details is well-hidden below. Namely the configuration with Xeon 5160 (best result) has:

64GB memory vs. Opteron's 32GB,
runs at 400MHz higher frequency

A fairer comparison is the Xeon 5080 vs. Opteron. Even there, the difference in results is too small given the huge difference in processor frequencies: the Xeon 5080 runs at a 1.1GHz higher frequency than the Opteron (Xeon@3.7GHz vs. Opteron@2.6GHz). Maybe the flashy graph is enough to convince managers of the "superiority" of Intel's technology, but it didn't convince me.

Tags: AMD Intel Opteron Pentium

2006-06-08

Vesta: yet another source management tool

Has anyone experience with the Vesta Configuration Management System? Summary from the homepage: "Vesta is a portable SCM system targeted at supporting development of software systems of almost any size, from fairly small (under 10,000 source lines) to very large (10,000,000 source lines)."

Now, what really drew me to it is that it also automatically handles the build process (dependencies and other stuff that is simply tedious to do with plain make). Currently I'm using Subversion for source control, QMake to generate Makefiles, and GNU make to build my projects. QMake saves a lot of work, but an automated solution would be even better. Comments?

Tags: version control vesta vestasys make makefile qmake subversion

2006-06-03

Hosting - found!

Thanks to a friend, I now have the Subversion+Trac hosting for my project. I have caught some time to write basic information on Trac and to import the currently existing source. To repeat shortly: the project is to write a BSD-licensed replacement (p11scd) for the "standard" GnuPG smart-card daemon. p11scd shall work with PKCS#11 smart-cards. The project homepage is here.

If you are a competent C programmer and interested in the project, you are welcome to join!

Tags: gnupg gpg cryptography pkcs

2006-05-30

Support the inhabitants of Travno

This post is a bit different from previous ones. This one is about a specific political issue and the arrogant attitude of the Catholic church towards inhabitants of Travno. Travno is a part of Zagreb, the capital of Croatia. There is a big residential building (largest in Croatia) inhabited by around 5000 people. The building has a large green park by its side, as you can see from the pictures on the link. The large picture on the frontpage says "Light the candle - here is where the democracy is dying".

Namely, the church insists on building the church in the middle of the park where many children play (thus destroying about 1/3 of its area) despite the protests (one was held yesterday) of the Travno inhabitants, despite the fact that there is available "non-grass" space at the same distance, and despite the critique of many architects. Neither the politicians nor the church listen to their suggestions. The inhabitants are accused of being "anti-christians" and their leader was even threatened with death a few months ago.

The inhabitants are not against the church building itself, they just don't want it to be in the middle of the park. And with so many large concrete buildings, this residential block needs the park. And very near there is available space for the church. However, neither the church nor Zagreb's town council, currently led by Milan Bandic (BTW, he has been caught driving while drunk), will listen to the will of the people. The democracy is truly dying in Travno. That park is really beautiful and I hope that politicians will at last start hearing the will of the people they are supposed to serve.

For those who understand Croatian, this is the link to the news article.

Please help spread the word about this! The picture above is in support of the inhabitants of Travno. Many bloggers have also organized themselves in support of the Travno inhabitants. The text on the picture says: "I plead with bloggers to take this candle to their blogs. Let the world see that there still exist people who remember what the grass looks like, how it feels to play on the lawn, who are against asphalt and concrete.. Let the world see how many of us there are!" At the bottom is "Action to save green surfaces", and in the bottom-left corner "Citizens' initiative".

Tags: croatia zagreb church

2006-05-28

Broken getrusage() in Linux

I was writing a program and wanted to measure both its execution time and used memory. For time measurement, getrusage(2) works just fine. However, it returned 0 for memory usage. A quote from the man page:
"The above struct was taken from BSD 4.3 Reno. Not all fields are meaningful under Linux. Right now (Linux 2.4, 2.6) only the fields ru_utime, ru_stime, ru_minflt, ru_majflt, and ru_nswap are maintained."
(This refers to struct rusage, filled in by the getrusage() function.)

The question is why? At least some of the memory statistics that should be returned by getrusage() can already be obtained through /proc/*/stat.
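
For reference, a minimal sketch of the kind of measurement described above. The two helper names are made up for illustration. On the 2.4/2.6 kernels discussed here the memory field would simply read as 0; as far as I know, later kernels (the man page says since Linux 2.6.32) do fill in ru_maxrss, in kilobytes.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* User-mode CPU time used so far by this process, in microseconds,
   or -1 if getrusage() fails. */
long user_cpu_usec(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return (long)ru.ru_utime.tv_sec * 1000000L + (long)ru.ru_utime.tv_usec;
}

/* Peak resident set size as reported in ru_maxrss, or -1 on failure.
   A result of 0 means the kernel does not maintain the field. */
long peak_rss(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_maxrss;
}
```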

Tags: kernel linux

2006-05-25

Free Trac + Subversion hosting?

A simple question: can someone point me to free(-ish) trac+subversion hosting? I will set up a public project to start community development of BSD-licensed smart-card daemon in order to replace SCD that comes bundled with gnupg2.

As you might remember, many posts ago I have been criticizing Werner Koch's policy about refusing to include PKCS#11 support in scd. I've received enough mails from other users agreeing that his policy is not reasonable w.r.t. this feature.

I'm working on this project solely in my free time (which is not abundant as it is, and often gets spent on other things). I hope to draw in about 2-3 more developers interested in low-level crypto programming so that we can jointly finish the task in our free time. Reward? None, except for the joy of programming and later fame :) [OK, it can also be a good reference in the future when looking for a job, and might even draw in some commercial funding in the future.]

Tags: subversion trac

2006-05-14

RedHat - a nice surprise

Recently I began to work on a 64-bit machine with RHEL WS 4 installed. I was skeptical at first towards RH, but it turned out to be a nice experience. Installed packages (e.g. mysql, apache) have no automatic configuration at all. Defaults are reasonable. And they are not automatically put into all runlevels as in some other distributions. I don't have the feeling that I'm working "against" the system, which I had with many other distributions.

The best part is how security-conscious RedHat has become. I already mentioned not enabling daemons by default after installation. Another is that the firewall is always active and passes through a select few ports. And then there is SELinux. The "targeted" policy that comes with RedHat is almost invisible; it confines a few vulnerable daemons and lets other users do their work as usual. I noticed that SELinux is active because mysqld mysteriously reported EACCES on its datadir, even though all permissions were correct and the directory was accessible when I su'd to the mysql user. I had moved the data directory from its default place /var/lib/mysql to /home/mysql. The new directory wasn't marked in the policy as accessible to mysqld, so I had to fiddle a bit to fix that.

All in all - go for RedHat!

Tags: linux redhat

2006-05-11

Radio streaming over internet

It's incredible how many web radios make it hard to listen to their program for users who don't have Windows and don't want to mess with installing mplayer plugins into Firefox or Opera under Linux. Take for example the Norwegian Radio 1 (something I "hacked" today). I had to manually wget several obscure pages (whose links I obtained by searching through the ugly HTML+JavaScript of the web interface) just in order to be able to execute the following command and listen to the radio: mplayer http://213.158.233.199:3004.

Why can't they simply publish the URL on the website? My guess is that there are two reasons for that:

The direct URL wouldn't mean much to most of the users.

Revenue from commercials displayed on the web site.

Tags: streaming linux

2006-04-30

The devil is in the details: assertions

I've been debugging my btree removal code for the last 2 days. Assertions (#include <assert.h>), together with suitable printf() debugging, have proven to be an invaluable tool during the process.

The code follows the purely textual description of the B-tree data structure in Knuth's volume 3. There he says that "removal is only slightly more complicated than insertion". That might be (actually, is) so on a conceptual level. But when translated into code: around 4x more code, around that much more time, and a number of corner cases.

I've run the code through valgrind and I have no errors. Wow! :)

Tags: TAOCP algorithms Knuth

2006-04-29

Linux performance counters

Performance counter drivers under Linux are an interesting affair. There's oprofile, perfctr and perfmon2. None of them does exactly what I need in combination with Xeon, but put together they'd be an excellent tool.

Now: I want to measure L1 cache misses on Xeon. oprofile can't do it out of the box (one has to program some extra registers that oprofile doesn't seem to know about), perfctr has very weak user-mode tools (basically, it's just a library), and perfmon2 doesn't seem to support Xeon. perfmon2 seems the best designed project, compiles cleanly and reports it's initialized, but trying to start any user-mode program just says that the processor isn't supported. The author of perfctr has himself said on a mailing list that he's giving up on perfctr development in favor of perfmon2.

Has anyone managed to get perfmon2 working on Xeon/P4 (not Pentium M)? And how?

Tags: linux performance xeon oprofile perfctr perfmon2

2006-04-23

A neat trick

Recently I was reading Knuth's The Art of Computer Programming (vol. 3). The "uniform binary search" algorithm uses the ceil(x/2) operation where x is an integer. A very short, branch-free way to code this operation occurred to me. The following code assumes an unsigned x in %eax and finishes with the desired result in %eax: the shift moves the lowest bit into the carry flag, and adc adds it back.


shrl $1, %eax
adc  $0, %eax
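
The same idea can be sketched in C. The function name ceil_half is made up for illustration, and x is assumed to be unsigned, as with the shrl above.

```c
/* ceil(x/2) for unsigned x without a branch: the shift drops the low bit,
   and adding that bit back rounds odd values up. */
unsigned ceil_half(unsigned x)
{
    return (x >> 1) + (x & 1u);
}
```

The more obvious (x + 1) / 2 wraps around when x is the largest unsigned value; this form does not.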

Tags: TAOCP assembler knuth

Airplane traveling

For my vacation I went to Croatia, taking Norwegian airlines from Oslo to Rijeka (and back). The flight from Oslo was more than half an hour late, and then the pilot had to wait an extra 10 minutes because he missed the time-slot. And fritt setevalg (free choice of seats) is a bad idea: it takes people much longer to board the plane than when the seat is assigned at check-in. Miraculously, the flight from Rijeka was on time and the plane arrived even a bit early :)

Disembarking from the plane is another story. People are simply too slow. Somewhere I've heard that the plane layout must be such that it's possible to evacuate it in 1 minute (60 seconds). This seems impossible to me :)

During the 2:30 flight the following question came to my mind: does the long-distance flight route take Earth's rotation into account?

Tags: norwegian airplanes travel

2006-04-08

Vacation

Finally, 2 weeks of vacation! I won't be posting much I guess.

2006-04-02

GCC obscenity

Today, while debugging some of my code, I accidentally noticed that in certain cases gcc "optimizes" printf() calls into calls to puts(). This is plainly unacceptable for at least two reasons:

printf() might have side-effects that puts() doesn't (e.g. see this bug report).

It makes LD_PRELOAD interception of certain functions ineffective.

This "optimization" takes place even when I compile my program with the -ansi switch. I haven't been able to find a switch to turn off this "optimization". In addition, the generated code in this case is plain wrong. No optimization should change the semantics of the program (which it does in this case).
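
A minimal sketch of the kind of call in question; both function names are made up for illustration. As far as I understand it, the rewrite applies to calls whose format is a constant string ending in a newline and whose return value is ignored.

```c
#include <stdio.h>

/* Constant format ending in '\n', result discarded: this is the shape of
   call that gcc may emit as puts("hello"). */
void greet_discarded(void)
{
    printf("hello\n");
}

/* Here the result is used, so the call has to remain a printf():
   printf() returns the number of characters written (6 for "hello\n"),
   whereas puts() only promises a non-negative value. */
int greet_counted(void)
{
    return printf("hello\n");
}
```

Comparing the output of gcc -S for the two functions shows which call was actually emitted. The gcc manual documents -fno-builtin, and the per-function form -fno-builtin-printf, for disabling built-in handling of such functions; that may be the switch to try here.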

C has always been a language along the lines of do what I say. How many other such surprises are hidden in gcc?

What they have done with this "optimization" is on the level with (if not worse than) MS's adaptation of C++ for CIL. Strangely, so many people bark at MS for their "wrong-doings", but the bug report cited above never even got a response. Well, at least I'm going to say this: shame on you gcc developers!

Tags: gcc

2006-03-26

Root shell or no root shell?

I'm developing a program which has to run with root privileges (e.g. to be able to execute mlock()). It's boring to su to a root shell all the time, so I opened a terminal with root constantly logged on. Sometimes I type faster than I think, so to protect myself from typing the wrong thing into the wrong window, I executed unset PATH in the root shell. Now root can't "accidentally" execute anything. :)

Tags: unix

2006-03-21

Applied algebra

Today I faced a problem: how to generate a pseudo-random permutation of numbers in range [0,2^128), such that each number appears exactly once. An obvious solution is to initialize an array A[i]=i and then shuffle elements with repeated calls to random(). However, that would use a large amount of memory, which is not acceptable in this particular case.

So I remembered a bit of algebra that I learned a long time ago at the university. It's about cyclic groups. Consider the sequence k*a0 % M for k = 0, 1, ..., M-1. In my case M=2^128. This sequence generates all numbers less than M exactly once, provided that a0 and M are relatively prime (for M=2^128, this means any odd a0). In short, a0 is a generator of an additive cyclic group where the set of elements is the natural numbers in the range G=[0,2^128), and the operation is addition modulo M.

The code is even simpler (pseudo-C):


a = 0;
do {
  /* use a in some way */
  a += a0;
  if(a >= M) a -= M;  /* >= so that a wraps back to 0 */
} while(a);

When a becomes 0 again, the whole group of M elements has been generated, so the loop exits. In other words, a has cycled through all elements in G exactly once. This is also connected with pseudo-random number generators. I'm not interested in the exact statistical properties of the generated numbers; the only requirement is that consecutive numbers are "far enough" apart (which I define by choosing an appropriate constant a0).
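
The claim is easy to check on a small modulus. The sketch below is only an illustration: the function name cycle_length is made up, and it assumes a0 < m and m small enough that a + a0 cannot overflow a 64-bit integer (so it does not handle M=2^128 itself).

```c
#include <stdint.h>

/* Walks a = k*a0 mod m starting from a = 0 and counts how many values are
   visited before a returns to 0. Assumes a0 < m and m <= 2^63. */
uint64_t cycle_length(uint64_t a0, uint64_t m)
{
    uint64_t a = 0, visited = 0;

    do {
        visited++;          /* "use" a here */
        a += a0;
        if (a >= m)
            a -= m;
    } while (a != 0);

    return visited;
}
```

With m = 16, a step of 7 (relatively prime to 16) visits all 16 values, while a step of 6 returns to 0 after only 8, because gcd(6, 16) = 2.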

Tags: algebra random numbers

2006-03-19

Fractals!

I had the opportunity to talk on IRC with the programmer of this fractal generator. In the course of the discussion, I came to the following idea: program the GPU to calculate a Mandelbrot zoom animation in real time and set that animation as the desktop wallpaper. It'd be the coolest desktop ever seen (IMHO, much cooler than the current xgl hype). I believe that current GPUs are powerful enough to perform this task.

Of course, the most fun part would be GPU programming.

Tags: fractals gpu

2006-03-18

GPL vs. the world

After having participated in a debate around GPL, I decided that it's about the time to change licensing on two of my SW projects: hashed text utilities, and secure password generator. I have changed it from the GPL to the MIT/X license. I can't change licensing of the Raster Alchemy project since it was GPL'd to begin with.

I don't agree with the concept of "freedom" that GPL advocates are spreading around. There is no freedom where there is a must. (I'm referring to the viral nature of the GPL.) An interesting view on the matter is this article.

Tags: gpl mpl

2006-03-15

Decent calculator

For a long time I have been looking for a decent scientific calculator to use on my computer. Apart from the ordinary capabilities, I need binary arithmetic (binary, decimal, octal and hexadecimal input/output) with the usual logic operators. And it'd be nice if it were stack-oriented (RPN, like the HP48 calculators).

Today I was programming something and I wished I had my HP48 with me. I couldn't find anything decent on the web, so I set to try the emacs calc. And there it was, with all the features I needed.

Tags: emacs hp48 rpn

2006-03-14

Tea in coffee

Have you ever wondered about the result of putting a teabag (and letting it stay for a while) into hot coffee? What would it taste like?

Tags: coffee tea

2006-03-12

On commercial SW

I have discovered a down-side of commercial software: I have an installation of Mathematica 5.2. After recompiling my Linux kernel (I just added the ethernet driver in the kernel), my "machine ID" changed and I had to request a new license from Wolfram Research. It was relatively painless to get a new license code (this time!). At best, this relicensing is a minor nuisance, but at its worst, Wolfram Research might get suspicious and make it very hard for me to get a new license in the future (if I do it too often). So now it seems that I'm stuck with the Linux 2.6.12 kernel in its current configuration.

Does anyone have more experience with Wolfram and "system transfer requests"?

Tags: Mathematica

2006-03-09

TODO lists

Based on an admittedly small observation sample, I think that people who maintain TODO lists have much less "free time" than people who don't. Some even maintain their TODO lists in the lack of other work.

Personally, I don't maintain a TODO list, and I manage to finish all my work in time, and there's also some time left to have fun :) I do maintain a (small) text file named "TODO", but it contains very low-priority stuff, which can be done anytime (tomorrow or in a year; doesn't matter). And all of the things there are longer-lasting things (e.g. reading a book). It should really be called "REMINDERS" instead of "TODO", but "TODO" is shorter :).

Your opinion?

Tags: TODO

2006-03-05

A bit on Java

I've read this excellent short essay. The thing that I disagree with is the following quote praising Java: "No more bounds errors, no more core dumps." This is a moot point after I've seen many Java programs printing NullPointerException on their stderr and continuing to run.

The key question is: how can you be sure that the program recovered to an internally consistent state after such an exception? Printing such an exception clearly indicates a bug somewhere. I'd rather have such a program crash (the sooner the better) than continue to run and produce possibly garbage results, and maybe store them in some files or in a database.

What do you think?

Tags: java

2006-03-02

"Smart" x86 design

Modern x86 processors (Athlon, Pentium4) have split L1 data and instruction caches. What I'm trying to accomplish is that instructions are fetched from the L1 cache (the code is small enough to fit in), but that data fetches bypass the cache hierarchy and directly access memory. Even the code currently being executed from the L1 instruction cache should be read directly from memory if accessed by a memory-referencing instruction. (There are no writes, just reads).

This seems impossible to achieve on the AMD/Intel architecture. I tried several combinations with CR0.CD and MTRR settings and I can have one of the two extremes: 1) both code and data are served from the cache if found there, or 2) both code and data read directly from memory (and this includes every instruction executed).

How smart do you have to be to split L1 I&D caches, but not to provide separate control over them?

Tags: amd assembler intel

2006-02-26

Public message to mr. Fabrice Bellard, author of qemu

Mr. Bellard, I think that I have found a bug in the version 0.8.0 of qemu. I'm testing a custom kernel and when qemu encounters the clflush instruction it just hangs. Contact me if you are interested in details. I guess you can find my email address easier than I could find yours.

Rationale: I'm sick and tired of damned forums and mailing lists. Mr. Bellard didn't leave his contact details anywhere on qemu's web site or in the source files. I wanted to report this bug, but it seems that I have to post it on some qemu forum/mailing list for which, of course, I have to first register. I'm sick and tired of those bloody registrations which I use only once and never again. So if he sees this message OK, if not, well.. his problem.

Oh yes, please, if some of you are subscribed to qemu's forum or mailing list, I would appreciate it if you drew Mr. Bellard's attention to this post.

Tags: qemu

IT Underground report

Yesterday afternoon I returned from the IT Underground conference in Prague, where I was an invited speaker. I gave a talk on the possible exploitation of smart-cards, and, well, I amazed myself. I had only 16 slides and was a bit worried about what I was going to talk about during the two hours planned in the agenda. In the end I had no problem talking and also gave a small demonstration of stealing data from applications.

The conference was very well organized - kudos to the organizers. It was held in the Hotel STEP, in the outer part of the town. The hotel was new, modern and pleasant to stay in (except that the shower cabins in the bathrooms were a bit small). The hotel was within a 10-minute walk of the nearest metro station, and with Prague's excellent public transport system, it was easy to get to the centre of the city to look around a bit and taste the excellent Czech cuisine and beer (for those who will visit Prague: garlic soup (nb! NOT onion soup) is a must to try). Prague is a beautiful city, and I was a bit nostalgic when I had to leave. And extremely cheap, compared to Oslo. I could get used to living there very quickly.

To me (and to most of the audience, I believe) the most impressive lecture was Shawn Merdinger's on vulnerabilities of VOIP phones. I couldn't believe how vulnerable those phones are. Shawn has investigated 11 different phones so far, and all of them had some security flaw: open HTTP, remote debugging, a telnet shell, etc. He gave a live demonstration of telnetting onto a VOIP phone, getting a shell and instructing it to make a phone call to another number. When you answer the other phone, you can listen to whatever the first phone transmits. An ideal spying device! Passwords? What passwords? Except maybe some default and well-known ones.

I made contacts, learned some interesting stuff, and had fun with some cool people. To put it shortly: it was great :)

2006-02-15

Linux is stupid..

I'm not saying that other OS'es (*BSD) aren't in this respect since I haven't tried, but here goes my annoyance.

I'm developing a kernel bootable by grub. The floppy image file is formatted to FAT32 and mounted on some directory. When I want to test a new version of the kernel, I copy the kernel executable file to the directory where the floppy image is mounted and run bochs. However, the change is not propagated to the underlying floppy image file! Bochs still runs the old kernel version. I have to explicitly type "sync" in order for the change to be flushed to the floppy image file.

Doesn't linux globally maintain data consistency? Are its FS buffers per filesystem? Why is it done in such a stupid way?

Gmail chat

Heh, when gmail chat is enabled for your account, it first asks you whether you want to have saved and searchable chat histories. I have answered no. BUT.. Should we trust them that they really respect our choice? They might still collect all our personal chats, and just not display them. What do you think? Do they really respect our privacy or just pretend to do so?

Tags: google

2006-02-14

A time for...

I've discovered when I should clip my fingernails: when I start to accidentally scratch myself until I bleed. Nothing serious, but still annoying. This triviality reminded me of an old song (could be 60s style, as far as I remember the music) whose refrain had verses along the lines of "a time to love", "a time to breathe",.. I can't remember exactly. I'm kind of fond of the song, so can anyone help me find it?

While I was writing this text, I remembered the key phrase by which I found the song on the Google. It's Wilson Phillips: "Turn! Turn! Turn! (To Everything There Is A Season)".

What turned out to be more surprising is that the song is heavily based on Ecclesiastes 3:1-8. The original text is reproduced here.

For some reason, this song moves me...

Tags: lyrics

2006-02-08

A simple puzzle

A short post after a loong pause. This means that I'm very busy with other stuff.

I am attending a course about formal verification of systems. The book we use in the course gives an example of the "best" (as claimed by the author) way to parallelize a simple task on 2 CPUs: finding the maximum element in a vector of n elements. The problem with their proposed solution is that it is, IMO, far from the "best". It took me only a few moments to think up a solution simpler than the one in the book.

The puzzle for my readers: suggest a way to parallelize the above-mentioned task to 2 CPUs. If you want a bigger challenge, generalize to k CPUs.

2006-01-28

Mathematica

Recently I've solved quite a number of programming challenges on the project euler site. Most of them I have solved using Wolfram's Mathematica. I am simply amazed how well-designed and easy to use it is as a programming tool. Rich mixture of functional programming tools, pattern matching and "classical" procedural programming. Excellent GUI, marvelous documentation, and it's even pretty fast in execution.

Apart from being an excellent programming language, it's also the best numeric/symbolic/visualization tool available. I have also worked with Matlab and MathCad, and they are below Mathematica in almost every respect.

Ok, now for the flaws from the programming perspective. I can see only one: no (explicit) support for lazy lists. For example, if you want to generate a list of odd numbers less than 100, you can write


Select[Table[i,{i,100}], OddQ]

(ok, this is suboptimal since odd numbers can be generated by 2*i-1 for a suitable range of i). I can't know how it works internally, but my guess (based on the time and memory consumption in a much larger example) is that it first generates a complete list of the first 100 numbers (the inner Table command) and then filters out the even numbers. A better way would be to "streamline" the application of Select and OddQ while the list is generated. Haskell works this way, as far as I understood from a little reading about it.

Tags: Mathematica Haskell

2006-01-17

Programming challenges

These days I'm not doing much coding, but I miss it. I can't engage myself in some long-term project (too busy with my PhD), but I like solving puzzles in my free time, esp. programming ones. The ACM competition problems require too much time, effort and training. Today I stumbled upon Project Euler. A nice collection of math-related programming challenges of various difficulty. I registered immediately!

Tags: programming mathematics

2006-01-16

The sad state of free software

I need to draw a diagram. A few boxes with text inside, connected with arrows that also have some text on top of them.

Try 1: OpenOffice draw, version 2.0. It comes pretty close to what I need except for one very annoying feature: I can't control the placement of labels over the connectors. Which wouldn't be so bad if the text were placed intelligently. But it's not. If I have an L-shaped line with some label, the text is not over the horizontal or the vertical part. Instead, it is in the middle of the virtual rectangle obtained by filling out the opposite parallel lines of L. Or, if the connector is completely horizontal, it strikes through the text (the text should be obviously placed above the line).

Try 2: Someone suggested dia. I drag a predefined shape (square) on the drawing area, double-click it in order to put the label inside, and what happens? NOTHING! Thank you, I'm not going to manually select the text tool, enter text, center it over the shape and group it. For each box. What's worse, dia crashes when run as an X11 application over the network.

Try 3: I connect to the Windows 2003 terminal server installed at my institution, launch Microsoft Visio and continue my work from there.

My needs are simple, yet two popular free tools for drawing diagrams aren't able to satisfy them "out of the box", if at all.

2006-01-12

Perl, python, security and vision

A friend pointed me today towards a very cool project named Logix. In short, it is a programming language with extensible syntax and macros. Ideal for defining domain-specific languages. I was awed when I saw what it is. It strongly reminds me of the power of LISP macros. And I haven't even looked at it in much detail. Currently, they have Python as their back-end, but they are considering switching in the long term. A quote from their "Future Work" section: "Efficiency and security are two areas that are somewhat lacking with a Python foundation." I think that, in the end, they will make a better Python than Python :)

These people seem to have a *vision* of what they want to achieve. The Perl folks also have their vision: the Perl6 language and the Parrot VM. I feel that the Python world is lacking its vision. OK, there is "Python 3000", but unlike Perl or Logix, it is vapourware, and AFAIK, not a line of code has been written. To build on the Python community's "batteries included" metaphor: putting a more powerful engine and more batteries in the car will make it go faster, but it won't make it *fly*.

They are also pretty conservative with the language, and this shows, for example, in their *disinterest* in the maintenance of the restricted execution module. Which brings me back to Perl: Ben Laurie wanted to implement capabilities in the Python language to provide fine-grained control over the execution of untrusted code. He says that the Python community was not interested, so he implemented it for Perl, resulting in CaPerl. Python doesn't even implement something akin to Perl's *tainting*.

Today, engineers and programmers are slowly becoming aware of ever-increasing security issues. Slowly, they are realizing that security has to be *built into* the system, and not an add-on. It has to be *pervasive*, and the best place to make it pervasive is *the runtime system itself*. I believe that today, without such security features, it is hard to take an interpreted/VM language into serious consideration for coding Internet applications. I believe that this disinterest in security will harm Python (and other such languages!) in the long run, at least for Internet applications. One of the major advantages of executing code in a VM is the easy restriction of what the code is or is not allowed to do. And the Python folks willfully do not exploit this advantage. And they have much to learn, for example from Java. I don't like Java for many things, but I have to admit that it has a well-engineered restricted execution model.

Another good idea that for some reason never caught on in the Python world is Stackless Python. Despite its obvious advantages, they went for the much weaker alternative in the form of generators.
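To show what generators do offer, here is a minimal sketch (written in present-day Python; the names task and round_robin are made up for illustration) of cooperative multitasking built on them. Note the restriction: a task can be suspended only at a yield written in its own body. An ordinary function called from the task cannot suspend it on its behalf, which, as far as I understand, is exactly what Stackless tasklets can do.

```python
from collections import deque

def task(name, steps):
    """A cooperative task: it gives up control only at its own yield."""
    for i in range(steps):
        yield f"{name}{i}"

def round_robin(tasks):
    """Resume each generator in turn until all of them are exhausted."""
    queue = deque(tasks)
    trace = []
    while queue:
        current = queue.popleft()
        try:
            trace.append(next(current))  # run the task up to its next yield
        except StopIteration:
            continue                     # task finished, do not requeue it
        queue.append(current)
    return trace

trace = round_robin([task("a", 2), task("b", 3)])
print(trace)  # ['a0', 'b0', 'a1', 'b1', 'b2']
```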

Conclusion? Python reminds me of Latin. A nice, well-structured language with firm syntax and rules (also with a few ugly exceptions!), unambiguous and expressive... and changing too slowly to meet modern demands. Today, Latin is effectively a dead language, save for a few niche areas (the Catholic Church). Will Python meet the same fate? Maybe IronPython, now backed by Microsoft, will introduce some innovation in the Python world.

Tags: perl python security

2006-01-10

NetBSD again..

It is a wonderful system, but it does not support all the latest features. For example, one can't use HW performance counters on SMP kernels, even in the new NetBSD 3. FreeBSD 6 now also supports fine-grained locking in the kernel, another feature that NetBSD is lacking (so it works less than optimally on SMP machines). Therefore, I'll probably switch to FreeBSD for my research purposes.

Tags: FreeBSD NetBSD BSD

2006-01-09

Java sucks

For many reasons. Too many to list here. Instead, I'm pointing you to the following article: The Perils of JavaSchools. The author has nicely summed up all the main points I dislike Java for.

Tags: Java

2006-01-08

Hacker's Delight

This is an excellent book of computer arithmetic tricks. I recently found that it has a dedicated web site. The site has the revision text for a possible 2nd edition, and links to other sites on topics related to the book. There you can also find a very interesting (and non-trivial!) "computist quiz". It is quite some food for thought!

Tags: books computers hacking quiz

2006-01-04

Sudoku solver

Finally, I have added the "original" sudoku solver to my collection. Again, it is a brute-force method, but very efficient.
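The solver itself is not reproduced in this post, so the following is not that code. It is only a generic sketch, in Python, of what a brute-force (backtracking) sudoku solver can look like; the function names candidates and solve and the sample puzzle are chosen purely for illustration, with 0 marking an empty cell.

```python
def candidates(grid, r, c):
    """Digits that do not clash with row r, column c, or the 3x3 box."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def solve(grid):
    """Fill the zeros in place by depth-first search; True on success."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d
                    if solve(grid):
                        return True
                grid[r][c] = 0  # nothing fits here: undo and backtrack
                return False
    return True  # no empty cell left, the grid is complete

# Sample puzzle, one string per row, 0 = empty cell.
puzzle = [
    "530070000",
    "600195000",
    "098000060",
    "800060003",
    "400803001",
    "700020006",
    "060000280",
    "000419005",
    "000080079",
]
grid = [[int(ch) for ch in row] for row in puzzle]
solved = solve(grid)
print(solved)
for row in grid:
    print("".join(map(str, row)))
```

The search always fills the first empty cell it finds; picking the cell with the fewest candidates instead is a common refinement, but it is not needed for the sketch to work.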

Tags: sudoku