Friday, June 10, 2011

gEDA and Guile — safe handling of non-local exits

This is the third in a series of blog posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post I'll be talking about an issue which the current gEDA codebase doesn't handle very well: non-local exits in C code that uses libguile.

What are non-local exits?

Non-local exits are a way for execution to jump out of the current execution context to another point in the program. The only way of doing this in standard C is through the quite limited setjmp/longjmp mechanism. Guile supports two fully-featured constructs for non-local control flow: exceptions and continuations.
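For readers who haven't met setjmp/longjmp before, here is a minimal sketch of a non-local exit in plain C. The function names (inner, outer, run_with_escape) are made up for illustration and have nothing to do with gEDA or libguile.

```c
#include <setjmp.h>
#include <stdio.h>

static jmp_buf escape_point;   /* where longjmp() will land */

/* Nested work that may bail out without returning normally. */
static void inner (int fail)
{
  if (fail)
    longjmp (escape_point, 1);   /* jump straight back to setjmp() */
  puts ("inner returned normally");
}

static void outer (int fail)
{
  inner (fail);
  puts ("outer returned normally");   /* skipped when inner() jumps */
}

/* Returns 0 if outer() returned normally, 1 if it was left via longjmp(). */
int run_with_escape (int fail)
{
  if (setjmp (escape_point) != 0)
    return 1;   /* we arrived here via longjmp() */
  outer (fail);
  return 0;
}
```

Calling run_with_escape (1) abandons both inner() and outer() half-way through, and nothing after the longjmp() call in either of them is executed.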

Exceptions will be fairly familiar to many developers, as they are heavily used in languages such as Java and Python. When an exception is thrown, the stack is unwound until an exception handler is found for the exception. If none is found, then the program will usually exit with an uncaught exception error. Guile uses exceptions to implement its error handling mechanism, and you can read about that in more detail in the Guile manual (Exceptions, Error Reporting).

The other way that a non-local exit might occur is via a continuation. These are pretty hard to get your head around, unfortunately. The Guile manual describes them as follows:

A “continuation” is the code that will execute when a given function or expression returns. For example, consider

(define (foo)
  (display "hello\n")
  (display (bar)) (newline)
  (exit))

The continuation from the call to bar comprises a display of the value returned, a newline and an exit. This can be expressed as a function of one argument.

(lambda (r)
  (display r) (newline)
  (exit))

In Scheme, continuations are represented as special procedures just like this. The special property is that when a continuation is called it abandons the current program location and jumps directly to that represented by the continuation.

A continuation is like a dynamic label, capturing at run-time a point in program execution, including all the nested calls that have led to it (or rather the code that will execute when those calls return).

Guile implements continuations that are usable from C code with some pretty crazy techniques for saving and restoring the C stack.

The important point to take away from this is that when you call a Scheme function from C, you cannot depend on it returning normally. That means that totally legitimate resource management in a normal C/GLib program, such as:

void myfunc()
{
  gchar *buf = g_strdup ("const string");
  anotherfunc (buf);
  g_free (buf);
}

becomes unsafe if anotherfunc() calls Scheme code that might either call a continuation or throw an exception, since in that case the call to g_free() would never occur and the memory assigned to buf will be leaked.
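The leak can be reproduced without Guile at all. The following sketch uses longjmp() as a stand-in for a Scheme exception or continuation call, and plain malloc()/free() in place of the GLib functions. The `throws` parameter, the live_buffers counter and count_leaks() are assumptions added purely so that the effect can be observed.

```c
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf handler_point;   /* stands in for a Scheme exception handler */
static int live_buffers = 0;    /* allocations that have not been freed yet */

/* Hypothetical stand-in for a function that calls into Scheme. */
static void anotherfunc (const char *buf, int throws)
{
  (void) buf;
  if (throws)
    longjmp (handler_point, 1);   /* simulate a non-local exit */
}

static void myfunc (int throws)
{
  char *buf = malloc (16);
  if (buf == NULL)
    return;
  strcpy (buf, "const string");
  live_buffers++;

  anotherfunc (buf, throws);

  free (buf);          /* never reached when anotherfunc() jumps */
  live_buffers--;
}

/* Returns the number of buffers leaked by one call to myfunc(). */
int count_leaks (int throws)
{
  live_buffers = 0;
  if (setjmp (handler_point) == 0)
    myfunc (throws);
  return live_buffers;
}
```

With throws set to 0 the buffer is freed as usual; with throws set to 1 the call to free() is skipped and count_leaks() reports one buffer still allocated.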

In gEDA there are several places which do exactly this, and they need fixing.

Dealing with non-local exits safely

Fortunately, Guile provides two mechanisms for dealing with this, called Continuation Barriers and Dynamic Wind.

Continuation barriers are the less powerful of the two mechanisms, but are simpler to understand. They simply block both exceptions and continuations from entering or leaving the context of the barrier call. Here is how to modify the previous example to use a continuation barrier:

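static void *call_anotherfunc (void *data) /* adapter: the barrier wants void *(*)(void *) */
{
  anotherfunc (data);
  return NULL;
}

void myfunc()
{
  gchar *buf = g_strdup ("const string");
  scm_c_with_continuation_barrier (call_anotherfunc, buf);
  g_free (buf);
}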

There's an obvious disadvantage to this approach: you can't catch and handle exceptions which occur, or allow exceptions to propagate upward to the function that called myfunc() in case they can be caught and handled there!

Dynamic wind provides a more flexible method. The scm_dynwind_begin() and scm_dynwind_end() functions delimit a dynamic extent, and it's possible to register actions to be carried out whenever the dynamic extent is entered or left (e.g. due to a continuation or exception). It's also possible to indicate that non-local exits from the dynamic extent are permitted, but non-local entries aren't (this is the usual thing to do when using dynamic wind from C code).

Here is myfunc() modified to use dynamic wind:

void myfunc()
{
  /* Begin a dynamic extent that can't be re-entered */
  scm_dynwind_begin (0);

  gchar *buf = g_strdup ("const string");

  /* Make sure that buf is freed when the dynamic extent is
   * left either locally or non-locally */
  scm_dynwind_unwind_handler (g_free, buf, SCM_F_WIND_EXPLICITLY);

  anotherfunc (buf);

  /* End dynamic extent */
  scm_dynwind_end ();
}

This ensures that buf is always freed. If anotherfunc() raises an exception, or it calls a continuation that jumps out of myfunc(), the unwind handler will make sure that the string is freed. Since the SCM_F_WIND_EXPLICITLY flag was passed to scm_dynwind_unwind_handler(), the handler will also be called if the call to scm_dynwind_end() is reached and the function returns normally.

Conclusion

One task that's well past due is a review of functions in gEDA that call into Scheme to make sure that they safely handle non-local exits, whether by using continuation barriers, dynamic wind, or the "protected" functions provided in libgeda (see libgeda/src/g_basic.c). This would be ideal for someone less familiar with the code base to tackle, since it can be approached on a function-by-function basis.

In my next post, I will describe how to set up your environment to get notifications about the use of deprecated libguile APIs in gEDA, and in particular the problems caused by the continued use of the long-deprecated SCM_STRING_CHARS macro.
