Monitors are broken

This post is more of a rant...

I think that the current widespread programming "abstractions" for concurrency (locks, condition variables) are flawed, insufficient, error-prone, whatever you like. It's easy to get into a deadlock, the programmer has to be meticulous in deciding which functions should be synchronized with respect to which objects, etc. No, I don't view Java's synchronized keyword as something good and well-designed. It's basically still the same paradigm with more details hidden away.

My biggest objection is to monitors: the name is most often mentioned in conjunction with condition variables. Simple on the surface, they can have surprising semantics. For example, pthread_cond_wait atomically unlocks the mutex and blocks the thread. When the thread is woken up, the mutex associated with the condition variable is reacquired, but not atomically with the unblocking: another thread can grab the mutex first and change the state. So you have to put your condition wait in a while loop and re-test on each awakening whether the condition is true. And pthread_cond_signal wakes an unspecified waiting thread, so it's very easy to have starvation.
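The while-loop idiom is not specific to pthreads. Here is a minimal sketch of it using Python's threading.Condition, which has the same release-and-reacquire behaviour around wait(). The SimpleQueue class and the demo function are hypothetical names made up for this illustration, not anything from a real library.

```python
import threading
from collections import deque


class SimpleQueue:
    """Hypothetical monitor-style queue: one lock, one condition variable."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()  # creates and owns its own lock

    def put(self, item):
        with self._cond:
            self._items.append(item)
            # Wakes at most one waiter; which one is not specified.
            self._cond.notify()

    def get(self):
        with self._cond:
            # Re-test the predicate after every wakeup: another thread may
            # have taken the item between the notify and our reacquiring
            # the lock, so a single "if" would not be enough here.
            while not self._items:
                self._cond.wait()  # releases the lock while blocked
            return self._items.popleft()


def demo():
    """Start three consumers, feed them three items, return what they got."""
    q = SimpleQueue()
    results = []
    consumers = [
        threading.Thread(target=lambda: results.append(q.get()))
        for _ in range(3)
    ]
    for t in consumers:
        t.start()
    for i in range(3):
        q.put(i)
    for t in consumers:
        t.join()
    return sorted(results)
```

Replacing the while with an if makes get() racy: a woken consumer can find the queue already emptied by a consumer that never had to wait.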

Is it possible to make a paradigm shift in this area? Does concurrent programming really have to be so error-prone? At least some guys are trying; take a look at another Microsoft Research project for a possible design of novel concurrency primitives: Polyphonic C#. A quote from the online introduction: "... polyphonic methods do not lock the whole object and are not executed with 'monitor semantics'". Yaay, no monitors! :)


Python scopes, again

A friend drew my attention to the following links: this, and this. It seems that I'm not the only one annoyed by this misfeature. The second link is really worth reading as it describes the fundamental problem: the = operator being used both for creating a new binding and for mutating an existing one.
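To make the binding-versus-mutation problem concrete, here is a small sketch of my own (it is not taken from either link, and the function names are hypothetical). In the first closure, = on the bare name makes Python treat count as a brand new local, so reading it fails. In the second, = assigns into an existing list object, so the outer name is never rebound and everything works.

```python
def make_counter_broken():
    count = 0

    def increment():
        # '=' on a bare name creates a NEW local binding for 'count',
        # so the read on the right-hand side hits an unassigned local.
        count = count + 1
        return count

    return increment


def make_counter_mutating():
    count = [0]

    def increment():
        # Here '=' mutates the existing list object; the name 'count'
        # itself is only read, so it is found in the enclosing scope.
        count[0] = count[0] + 1
        return count[0]

    return increment


def call_broken():
    """Return the name of the exception raised by the broken counter."""
    try:
        make_counter_broken()()
    except UnboundLocalError as exc:
        return type(exc).__name__
    return None
```

The one-element list is the classic workaround; it works, but it is exactly the kind of hack that the scoping rule pushes you into.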


Python global stupidity

Consider the following seemingly trivial piece of Python code, pasted directly from the interpreter:

>>> x=0
>>> def f():
...     x+=3
...     print x
...
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in f
UnboundLocalError: local variable 'x' referenced before assignment

Yes, thank you, I know that global variables are "evil". I don't care when I'm writing a throw-away program.

My biggest objection to Python while I was learning it was its strange and unintuitive scoping rule. And even now, when I'm pretty experienced in Python, I still get bitten by it sometimes. Here's how to fix the code above: add global x just before the augmented assignment. Strangely, the following piece of code works:

>>> def g():
...     print x
...
>>> g()
0

Personally, I think that this is both unexpected and broken behaviour. This is not how lexical scoping (present in all modern programming languages) works. And the more I follow Python development, the more hackish the language is becoming.

If you want a nice, small, and elegant language to embed in your application for scripting, take a look at Lua.


Eclipse IDE

I'm trying out Eclipse as an IDE for C and C++ development. So far, I'm happy. There are several reasons why I've given up on the emacs+gdb+xterm combination, in order of importance:
  1. I am tired of manual build management. Make, SCons, whatever, are all equally tedious to set up and use properly. I'm not doing it anymore.

  2. There are other C/C++ IDEs, such as Anjuta2[1]. I did not bother looking into them as I have another reason for trying out Eclipse: to become productive with it. I have a project in mind, the development of which is most easily done in Eclipse. Yes, in the (not-so-)near future, I'm going to program in Java although it sucks big time. On what? That's a secret for now.

  3. emacs is a pain to set up and use properly. Here is a brief example: many of emacs's key combinations are too verbose for my taste, esp. for a feature that I use frequently: marks (VI: mx, 'x in command mode). So I'm using viper, the VI emulation mode. However, I didn't manage to convince emacs to a) turn viper on automatically at startup, and b) set autoindent mode (:set ai). Everything is set in the config files, but it just doesn't have any effect. And I know for certain that it's possible to accomplish this, as I've stumbled upon it in the past.

Basically, I've gotten tired of wasting time on trivia like makefiles and programming the emacs editor, which in the end I never did learn how to program.

So far, I'm happy with Eclipse. Although written in Java, it is pretty responsive. It does basic project management for me and has integration with subversion (although not perfect; there are cases where I have to resort to the console). At last I'm experiencing the benefits of automatic struct field completion on Linux[2] (it's a great help not having to refer to documentation or my own header files to remember what some field is called), etc.

[1] As for Anjuta2, I got discouraged by this web page. Quote: "It currently provides an editor with syntax hilighting and a rudimentary project management system." Doesn't seem worth trying out. This short summary sounds like emacs/vi + some project management. How did I stumble upon that page instead of the project's home page? It's the first page listed on Google when you search for "anjuta2".

[2] I've done a console program for Win32 as a part-time job. I was using Visual Studio.NET then, and I'm happy with that IDE too. VS6 is, IMO, crap, but VS.NET is an excellent IDE. Everything that annoyed me in VS6 has been fixed in VS.NET, and it added features that really made my life easy. I felt that the IDE was actually helping me, as opposed to VS6, where I felt that it fought against me.


Security, Microsoft, WinNT, Java, .NET, etc.

A bit of everything in this post. Yesterday and today I gave lectures at the university about file systems and computer security. In the FS lecture I mentioned NTFS, while in the security lectures I talked a bit about the Windows NT security model and the NT kernel. I said openly in front of the students that, as much as I don't like Microsoft, I think that the NT kernel and NTFS are well-designed. The NT kernel itself has a VMS heritage, and VMS is known as one of the most secure systems. So what happened to Windows?

Win32 API happened. Not many people know that the Win32 API is just a layer over the NT kernel, which is mostly undocumented. Applications call into the Win32 API, which in turn makes a series of calls into the NT kernel. I have read a bit about the NT kernel in Tanenbaum's operating systems book and a bit on the internet, and I have to admit that it has a really good design. You can run a UNIX environment on top of it, and it runs an OS/2 emulation layer. Microsoft now gives away their SFU (Services for Unix) package for free: a high-quality POSIX API implementation, Korn shell, utilities, etc. You also have the option to install gcc! OK, there are drawbacks as well, such as not having proper support for position-independent code the way ELF binaries have, but that could be fixed. It's a "feature" of the Win32 PE/COFF loader. A consequence of this is that a DLL must be relocated if it can't be loaded at its preferred load address. Relocation means patching machine instructions, so you lose the advantage of sharing code pages: suddenly you have a new, very similar, copy of code which is already in memory.

Although I'm a UNIX user, I have to admit that I admire the NT kernel, its security model, its object-based design and the NTFS file system. I was unjust when I told the students that I don't like Microsoft. What I actually don't like is:
  • From the programmer's perspective, the brain-damaged Win32 API.

  • From the user's perspective, the idiotic GUI which makes it next to impossible to perform tasks efficiently.

For these reasons I currently avoid Windows in my professional and hobby work as much as possible. Currently, it is an operating system for lamers, not for scientists and programmers.

It would be fun for me if I could get the raw NT kernel, a few basic device drivers, a decent shell and documentation, and then start exploring its capabilities and build a novel operating system on top of it. I think that it is general enough to support any kind of application area.

But things are changing in the Win32 world. I'm closely following what is happening with .NET, and I like what I'm seeing. Look, for example, at the preview of features for C# 3.0. Then there is Monad, their novel shell based on objects. It seems that they are striving to take over the UNIX sysadmin base and I think that they will succeed eventually. Heck, they managed to warm me up.

Now, Java. It is a brain-damaged language and platform. I sincerely hope that MS manages to kill it in the long run. I wonder why they still support running Java applications on their OS, since Java is their direct competitor. My prediction is that Java will run terribly slowly in Longhorn, even more so than now: they will never refuse to run Java programs and the VM, but it will run very slowly compared to .NET. Even now it is slower according to some benchmarks.

There are other interesting projects from Microsoft Research, in programming languages, operating systems, etc. Check out their web site. Too bad that MS has already created an "evil" perception of themselves so that people are automatically skeptical of good things coming out of MS Research.


DRM could help privacy

Most people think that DRM is evil, and I mostly agree. The main argument against DRM is that it could take control out of the computer owner's hands. But I think that there is one good thing that could come out of it: the creator of the content decides in what ways it may and may not be used. The "only" problem with this is where control over the use of the content stops and control over the computer begins.

So what happened that made me wish I had DRM? I sent a mail to someone, and that person put another person in CC: when replying. I didn't want the mail to be CC'd or BCC'd. It was supposed to be a private mail. But there is no way that I can enforce it. I did not even dare to ask the original recipient NOT to put anyone in CC: when replying, because I know him. He's pretty sloppy and he could have forgotten to delete the relevant parts of the original message (where I ask for it not to be CC'd) in addition to CC:-ing the reply, which would make things very bad.

OK, this time no damage was done, because I half-expected up front that the recipient would CC his reply, so I was careful about what I wrote. But the question remains - how can you, with current technology, enforce such policies and make sure that private communication remains private? Currently, written communication in any form (esp. digital, like email) relies, most often, on a silent agreement between the parties. The fact is that once the document leaves the creator's supervision, the receiving parties can do anything they want with it.

Most of the time silent agreements about privacy work well. I think it is so because of the mutual interest that all parties have in keeping the communication private. Sometimes, as happened to me, they fail, i.e. there is a discrepancy between communicating parties in what should be held private, or a difference in perceived mutual interest.

If I had some kind of DRM, I could just mark my mail as "for your eyes only" and forget about the problem. I'm interested in other opinions - how do you (try to) solve this problem?

The Opera problem is fixed!

Yaay! I can now go back to using Opera for posting to Blogger.


Disk encryption

I've set up my Gentoo Linux to use disk encryption in the following cases:
  • encrypted swap using a random key on each boot
  • encrypted /tmp using a random key on each boot
  • encrypted disk partition for sensitive data

I'm trying out BestCrypt from Jetico Software, a Finnish company. At first I thought of using cryptoloop or dm-crypt, but they have some security weaknesses. Kernels >= 2.6.10 include some stronger IV modes, according to the author of the patch. Does anyone know of any other analysis of this new encryption mode?

On the other hand, I didn't find any formal analysis of BestCrypt either. But it also doesn't have any published weaknesses. It is a commercial piece of software, not very expensive, with two bonus features:
  • hidden encrypted containers, and
  • encrypted containers are readable on Win32.

The latter feature might come in handy. BestCrypt is also very user-friendly, flexible and almost trivial to use. I think it is worth its price.

Ah yes, my threat model, and why I have decided to use disk encryption now: soon I'm going to travel around with my laptop and I don't want my work and personal data available to strangers if it gets stolen. So I'm moving all of my mail, work and private stuff to the encrypted partition, along with all of the dot-files in my home directory. It's surprising how much data can be found there.


Blogger Opera problems: update

I just received a reply from user support. They say that the problem is known and that they are working on fixing it. So, I'll wait a little bit longer and see what happens.


Blogger ate my post!

Yesterday I was typing my post, and Blogger ate it! Well, almost. I had to stop typing, so I saved the text as a draft. Usually I copy the text from the browser into some local scratch file before submitting it. Yesterday I forgot to do that. When I returned to continue editing the draft - no text was there! Today I tried to make a test post, with the same result. All of this I did using the Opera browser.

So I tried to publish something using Firefox - and it worked. It wouldn't be so surprising if I had not published 3-4 posts previously with Opera.

The conclusion - Google/Blogger made some changes in the meantime, and they've caused Opera to be unusable for publishing blogs! Time to contact them, if possible.


Another NetBSD kernel crash

It's a bit ironic to start getting frequent crashes after donating some money to the NetBSD project :) Seriously, I'm very happy with NetBSD (despite its rare crashes in weird circumstances) and I'm eagerly waiting for the 3.0 release, so.. what the heck. I seriously think that the NetBSD project is doing an excellent job and deserves support.

Now, about the crash. Find the details here.


Linguistics part 2: tenses in indirect speech

I've learned another easy-to-learn but, to me, unintuitive "feature" of the Norwegian language. It concerns indirect speech. I've been told that it also works this way in English, so I'm going to describe it with an English example. Consider the following sentences:

  1. Each time I ask her something, she answers: "I don't know."

  2. Yesterday I asked her something and she answered: "I don't know."

  3. Yesterday I asked her something and she answered: "I didn't know."

and the following indirect speech sentences:

  1. Each time I ask her something, she answers that she doesn't know.

  2. Yesterday I asked her something, and she answered that she didn't know.

  3. Yesterday I asked her something, and she answered that she doesn't know.

The 1st and 2nd sentences in direct speech translate to the 1st and 2nd in indirect speech. So, where is the stumbling block? Well, according to "my" logic the correspondence should be 1(direct)-1(indirect), 2-3, 3-2. In the Croatian language, changing the tense in reported speech changes the semantics of what the person really said.

My thinking is not quite in line with the "correct" 1-1, 2-2, with 3(indirect) being nonsense as it is currently written (to make it correct, it should be changed to some "more past" tense - "hadn't known"?). This is yet another easy rule that I can accept, but it isn't logical to me (i.e. it is not in line with how it works in Croatian).

I don't know whether I'll write more on linguistic themes, but I've learned the following: your first language deeply shapes the way you think. I might become a very good user of some foreign language not related to the Slavic languages, like Norwegian or English, but I strongly doubt that I will ever really understand it.

And a small joke at the end: What is the most spoken language in the world? Answer: English, as spoken by foreigners.