2007-04-10

Linux signal handling is broken

Enter sigaction() in combination with the SA_SIGINFO flag. A signal handler installed this way accepts three arguments, the third being the context of the interrupted thread (the full machine state needed to resume it, e.g. the registers).
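For reference, here is a minimal sketch of installing such a three-argument handler. The names on_signal, install_handler and got_context are made up for illustration; the handler only records that it received a context pointer and does nothing with the machine state.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <ucontext.h>

/* Hypothetical flag: set once the handler has seen a context pointer. */
static volatile sig_atomic_t got_context = 0;

/* Three-argument handler; the third argument points to a ucontext_t
   describing the interrupted thread. */
static void on_signal(int signo, siginfo_t *psi, void *pv)
{
    ucontext_t *puc = (ucontext_t *)pv;
    (void)signo;
    (void)psi;
    got_context = (puc != NULL);
}

/* Install on_signal for signo; returns 0 on success, -1 on error. */
static int install_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;  /* sa_sigaction, not sa_handler */
    sa.sa_flags = SA_SIGINFO;     /* request the three-argument form */
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Calling install_handler(SIGUSR1) and then raise(SIGUSR1) in a single-threaded program should run the handler before raise() returns.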

First problem: the Linux ABI is broken. The FP state in the uc_mcontext member of the ucontext_t structure is a pointer instead of a value. This makes copying the context nontrivial: a plain structure copy duplicates the pointer, not the FP state it points to.
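To make the copying problem concrete, here is a rough sketch of a "deep" copy. It assumes glibc on x86, where the pointer is uc_mcontext.fpregs and points to a struct _libc_fpstate. The names saved_context, save_context and fp_state_is_private are invented for this example. Also note the assumption that the FP state fits in struct _libc_fpstate; a signal frame built by the kernel may carry additional extended state that this sketch does not copy.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

/* Assumption: only glibc on x86 Linux is known here to expose the FP
   state as a pointer named fpregs; elsewhere we fall back to memcpy. */
#if defined(__linux__) && defined(__GLIBC__) && \
    (defined(__x86_64__) || defined(__i386__))
#define FP_STATE_IS_POINTER 1
#else
#define FP_STATE_IS_POINTER 0
#endif

/* A saved context together with private storage for its FP state. */
struct saved_context {
    ucontext_t uc;
#if FP_STATE_IS_POINTER
    struct _libc_fpstate fp;
#endif
};

/* Copy *src into *dst and re-point the FP state at dst's own storage. */
static void save_context(struct saved_context *dst, const ucontext_t *src)
{
    memcpy(&dst->uc, src, sizeof dst->uc);
#if FP_STATE_IS_POINTER
    if (src->uc_mcontext.fpregs != NULL) {
        memcpy(&dst->fp, src->uc_mcontext.fpregs, sizeof dst->fp);
        dst->uc.uc_mcontext.fpregs = &dst->fp;
    }
#endif
}

/* Returns nonzero if the copy no longer refers to foreign FP storage. */
static int fp_state_is_private(const struct saved_context *sc)
{
#if FP_STATE_IS_POINTER
    return sc->uc.uc_mcontext.fpregs == NULL ||
           sc->uc.uc_mcontext.fpregs == &sc->fp;
#else
    (void)sc;
    return 1;
#endif
}
```

Without the re-pointing step, the copy would keep referring to FP state stored inside the original context (or inside the signal frame), which is gone by the time the copy is used.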

Second problem: you can't use setcontext() to leave a signal handler and jump into another, previously saved context. (For that matter, you can't even use it to return to the very same context that was passed to the signal handler.) In other words, a signal handler like
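static ucontext_t *puc_old, *puc_another; /* set up elsewhere */

static void sighandler(
  int signo, siginfo_t *psi, void *pv)
{
  /* save the interrupted context */
  memcpy(puc_old, pv, sizeof(ucontext_t));
  /* choose another context to dispatch */
  setcontext(puc_another);
}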
does not work. It does not restore the signal mask specified in puc_another, does not reestablish the alternate signal stack, etc. However, this scheme works flawlessly on Solaris.

How am I fixing it on Linux? I walk the stack frames (following the saved frame pointer), modify the return address so that the signal handler returns to itself instead of to the interrupted context, etc. Very ugly and nonportable.

Not to mention that I'm relying on luck: it seems that, under the current Linux kernel, it is not possible to atomically restore the signal mask and return from a signal handler to a context other than the immediately interrupted one. (Heck, it's not even possible to do it nonatomically without resorting to "black magic" involving reading hex dumps of stack frames.)

I'm installing Solaris in a virtual machine to try it out, and I'm seriously considering moving my development to Solaris.
