Thursday, June 30, 2011

Scheme API merge, and keybinding in gschem

Earlier this week I merged the gEDA Scheme API that I've been working on since late 2009 into the main gEDA unstable branch. It's been a pretty great chunk of work, and I'm glad to have been able to get it into a state where it's generally useful. It's been pretty fun working out how to answer users' feature requests on gEDA-user with a short Scheme function, too! Hopefully people will take this new feature and do cool things with it.

One of the main obstacles to doing cool things with the new Scheme API is that it's difficult to modify key bindings on the fly — gschem doesn't have any equivalent to Emacs' global-set-key, for example. That makes it hard for e.g. an rc file or Scheme plugin loaded after system-gschemrc to add new keybindings or modify existing ones. There's also only a single keymap, so there's no possibility for user-modifiable state-specific keymaps (e.g. for binding keys that the user presses while placing a symbol).

While on the subject of keybindings, the way gschem currently displays keybindings in the menu bar is a little bit rubbish — for example, it would be good to display "]" instead of "bracketright", and "o S" instead of "oShift S". It's also a shame that we don't support a wider range of modifier keys, including "super" ("⊞" on most keyboards).

I'm planning to fix all these issues by adding a comprehensive set of Scheme functions for working with keys, key sequences and key bindings to gschem. Keymaps will probably become a hashtable-based record type, and there will be functions for creating a key sequence from a string and vice versa. There will also be a function for obtaining a "pretty" string describing a key, suitable for displaying in a menu or in the status line.

Saturday, June 25, 2011

The gEDA Scheme API manual is here

I'm currently working to get my guile-scheme-api branch ready to merge into the main unstable gEDA branch (master). One of the main jobs that needed to get done before that's possible was to write a reference manual for all of the Scheme functions and variables that the new Scheme API provides, and I've now done that!

If you're interested in writing extensions for gEDA, please take a look at the draft gEDA Scheme Reference Manual and let me know what you think. I'm keen to receive any feedback you might have, whether about the manual or about the API itself.

I'm hoping to merge my work so far in this branch into unstable within the next week. At the same time, I'm aiming to add the following missing functionality as soon as possible:

  • Object transformations (translation, rotation and mirroring)
  • Picture object creation and manipulation
  • Path object creation and manipulation (quite tricky, unfortunately)
  • File saving and loading
  • Picture and component embedding/unembedding

These are all things that I'm sure extension authors would be quite interested in having available to them! Hopefully we'll see all sorts of interesting new capabilities for gEDA users coming out soon.

Thursday, June 16, 2011

gEDA and Guile — opportunities to get involved

This is the eighth and last in a series of posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post, I'll summarise the series. In addition, I'll recap on the opportunities to get involved in gEDA development offered by the problems I've identified with the current gEDA codebase.

In my first and second posts, I described whereabouts in the gEDA codebase to find code which uses Guile, and also how to set up a build environment to easily compile gEDA against either Guile 1.8 or Guile 2.0. Being able to test with the latest stable release of Guile is quite important, and should definitely be considered by all gEDA developers!

In my third post, I dealt with probably the most important issue: dealing with non-local execution due to Scheme exceptions and continuations. There are a lot of places in gEDA where it's assumed that a call to a libguile or Scheme function will return normally, and that's not a safe assumption. Failing to protect against that (e.g. by using the "dynamic wind" mechanism) can easily result in resources being leaked.

The following post explained how to get notified when gEDA uses deprecated Guile APIs. It also gave an example of an instance where using deprecated API long after its deprecation has led to actual bugs: the SCM_STRING_CHARS() and SCM_SYMBOL_CHARS() macros.

I went on to talk about the use of Scheme errors. I described the use of the SCM_ASSERT() macro to check function arguments, and also how to use Scheme errors more generally. The main points to remember here were that libguile has some really useful undocumented helper functions for raising Scheme errors from C code (see libguile/error.h), and that just because SCM_ASSERT() has "assert" in its name doesn't mean that you don't have to be careful about resource management if the test fails.

Finally, I talked about the guile-snarf tool that's used by my guile-scheme-api branch to generate boilerplate code for defining Scheme functions and variables from gEDA's C code.

There are several opportunities to get involved in making gEDA better. All these jobs are good for people who aren't that familiar with gEDA development but who want to learn the codebase and the development process, because they can mostly be done on a function-by-function basis.

  • You can help change places where SCM_STRING_CHARS() is still used, to use scm_to_utf8_string() instead. I've already had some patches from Ivan Stankovic to deal with some of these!
  • You can look at the places where gEDA calls into libguile or calls Scheme functions to make sure that any potential errors are handled properly.
  • You can double-check that all gEDA functions that take SCM arguments validate them properly using SCM_ASSERT().
  • Or you can help make sure that gEDA Scheme functions raise Scheme errors properly when things go wrong.

If you're interested in submitting patches addressing any of these, or have any questions, please get in touch.

As previously mentioned, my focus for gEDA development in the near future is going to be on getting my guile-scheme-api branch merged, and I encourage everybody to try it out and let me know what you think! For information on my various branches and how to get hold of them, please see my gEDA development page.

Wednesday, June 15, 2011

gEDA and Guile — reducing boilerplate with "snarfing macros"

This is the seventh in a series of blog posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post, I will explain how you can use the guile-snarf tool to reduce the boilerplate needed to define Scheme functions and variables from C.

At the time of writing, this post isn't really relevant to the main gEDA unstable development branch, since it doesn't currently use "snarfing". It might be an interesting read if you're interested in hacking on my guile-scheme-api branch.

Why is "snarfing" needed?

One of the problems with using libguile to define Scheme functions from C is that there's a certain amount of boilerplate setup code. For example, if I wish to define a function called "my-func" in C to be usable from Scheme, I need to do something like:

static SCM
my_func (SCM arg)
{
  SCM_ASSERT (scm_is_string (arg), arg, SCM_ARG1, "my-func");

  /* ... do something useful ... */
}

/* Called during application initialisation */
void
my_init ()
{
  scm_c_define_gsubr ("my-func", 1, 0, 0, (SCM (*)()) my_func);
}

There are a few problems with this.

  • If the number of arguments to the function changes, an update needs to be made in two different places.
  • If the string name of the function changes ("my-func" in this case), an update needs to be made in two places, and also to every other place the name is used (e.g. SCM_ASSERT() for arguments and any other place a Scheme error is raised).
  • If an extra function is added...

You get the idea.

Similarly, if there is a permanent Scheme value that's needed in multiple functions (such as a symbol) it's necessary to separately define the static variable and add a line to the initialisation function.

Finally, if the libguile API for defining a function or variable changes, it's necessary to individually update the initialisation code for every single one.

The guile-snarf tool

Guile comes with a tool called guile-snarf, which is run against a C source file to generate an additional C source file for inclusion, usually given a .x extension.

The tool recognises some special macros in the C source, which expand normally when compiled, but which are used by the tool to generate Scheme initialisation boilerplate. The example above would become:

SCM_DEFINE (my_func,   /* Function name in C */
            "my-func", /* Function name in Scheme */
            1, 0,      /* No. of required/optional args */
            0,         /* Whether accepts "rest" arg */
            (SCM arg), /* C argument list */
            "Do something exciting.") /* Docstring */
{
  SCM_ASSERT (scm_is_string (arg), arg, SCM_ARG1, s_my_func);

  /* ... do something useful ... */
}

/* Called during application initialisation */
void
my_init ()
{
  #include "mycfile.x"
}

This may seem like a small gain for this trivial example, but for files which define a large number of Scheme procedures there's a definite benefit! There are a few things to note:

  • Functions defined using SCM_DEFINE() are always static. This shouldn't normally be a problem, since they will usually be intended to be called from Scheme, not from other C functions.
  • For each function foo, SCM_DEFINE() also defines a static string called s_foo. This can be pretty handy, especially when needing to pass the Scheme name of the function to SCM_ASSERT() macros. (Its use is discouraged by the official documentation, but libguile itself makes use of it, so hey — it can't be that much of a problem!)

Although SCM_DEFINE is the main reason to use guile-snarf, there are several other really useful macros, such as SCM_SYMBOL(), which creates and defines a variable containing a given symbol. In my guile-scheme-api branch, you will fairly commonly see a block at the top of a C source file looking like:

SCM_SYMBOL (lower_left_sym , "lower-left");
SCM_SYMBOL (middle_left_sym , "middle-left");
SCM_SYMBOL (upper_left_sym , "upper-left");
SCM_SYMBOL (lower_center_sym , "lower-center");
SCM_SYMBOL (middle_center_sym , "middle-center");
SCM_SYMBOL (upper_center_sym , "upper-center");
SCM_SYMBOL (lower_right_sym , "lower-right");
SCM_SYMBOL (middle_right_sym , "middle-right");
SCM_SYMBOL (upper_right_sym , "upper-right");

This then allows functions in the file to use the symbols to construct Scheme expressions to be evaluated without needing to recreate the symbols (e.g. using scm_from_utf8_symbol()) every time the function is run.

Conclusion

Currently, when wishing to alter the prototype of a Scheme function provided by a gEDA program, you need to make changes in at least three places:

  1. The function definition itself.
  2. The function prototype in a header file.
  3. The function metadata exported to Scheme, usually in a file called g_register.c.

Using the SCM_DEFINE() "snarfing" macro and the guile-snarf tool would allow two of those places to be eliminated, and that's why my guile-scheme-api branch uses them.
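To make the "one place instead of three" idea concrete without needing libguile, here is a rough sketch using a different, plain-C technique: an X-macro table. This is not how guile-snarf works internally, and every name in it (EXPORTED_FUNCS, export_entry, find_export and the two example functions) is made up for illustration. It only shows how a single list can drive both the declarations and the registration table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One line per exported function: C name, exported name, required args.
 * All of these names are hypothetical. */
#define EXPORTED_FUNCS(X) \
  X (my_func,    "my-func",    1) \
  X (other_func, "other-func", 2)

/* Expand the list once to declare the C functions... */
#define DECLARE(c_name, s_name, nargs) int c_name (void);
EXPORTED_FUNCS (DECLARE)
#undef DECLARE

struct export_entry {
  const char *name;
  int required_args;
  int (*func) (void);
};

/* ...and once more to build the registration table. */
#define ENTRY(c_name, s_name, nargs) { s_name, nargs, c_name },
static const struct export_entry export_table[] = {
  EXPORTED_FUNCS (ENTRY)
};
#undef ENTRY

int my_func (void) { return 1; }
int other_func (void) { return 2; }

/* Look up an entry by its exported name; returns NULL if absent. */
const struct export_entry *
find_export (const char *name)
{
  size_t n = sizeof export_table / sizeof export_table[0];
  assert (name != NULL);
  for (size_t i = 0; i < n; i++) {
    if (strcmp (export_table[i].name, name) == 0)
      return &export_table[i];
  }
  return NULL;
}
```

Adding a function or changing its argument count means touching one line of EXPORTED_FUNCS plus the definition itself, which is roughly the saving that SCM_DEFINE() and guile-snarf give you.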

The next and final post in this series will summarise the things I've written about, and give a list of easy introductory gEDA development tasks based on some of the issues I've raised.

Monday, June 13, 2011

gEDA and Guile — how and when to use Scheme errors

This is the sixth in a series of blog posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post, I will try to make it clear how and when to raise Scheme errors in procedures implemented in C.

As explained in an earlier blog post in this series, Guile has a fully-fledged exception mechanism, which is used as the basis for its error reporting system. Because this allows execution to jump non-locally when an error occurs, it requires resources to be carefully managed.

Other than checking types using the SCM_ASSERT() macro, when else might it be useful to raise Scheme errors from C functions in gEDA?

When to raise Scheme errors

In Scheme functions written in Scheme, the answer is obvious: raise a Scheme error using error or scm-error. In C, it's not so simple.

Recall that there are currently four approaches to error handling in gEDA's C source code:

  1. Give up (log an error message and quit immediately).
  2. The GError mechanism from GLib.
  3. Best-effort (log a warning message and return a default value).
  4. Raise a Scheme error.

We try and avoid the "giving up" option, because quite often the situation is not actually hopeless, and the error could be handled by a function further up the stack or at the very least the user might be able to save his or her data.

In functions that are actual C API (i.e. designed primarily to be called by other C functions), Scheme errors are never appropriate, because they're quite complex to catch and handle in C. It's much easier to get from a GError to a Scheme error than the other way round! These sorts of functions should either use GError (for run-time errors outside the developers' control, such as being asked to load a file that doesn't exist), or use a "best-effort" approach (for programming errors, e.g. return NULL when asked for a list of objects in a NULL page).

On the other hand, functions that are designed primarily to be called from Scheme code should exclusively use Scheme errors. If there's a problem, either it'll be caught and handled, or it won't and an error message will be generated (and the program may exit).

This means that you shouldn't use best-effort helper macros like g_return_if_fail() — if you're called with invalid arguments, generate a Scheme error (e.g. using SCM_ASSERT()). You can apply a simple rule of thumb: if the SCM type appears in the function prototype, you should normally use Scheme error reporting.

How to raise Scheme errors

With libguile, the main function to use to raise Scheme errors is scm_error_scm(). However, the libguile headers provide some additional undocumented helper functions which make raising errors from C functions much more convenient. For example:

  • scm_error() is similar to scm_error_scm() but allows you to pass the raising function's name as a char * string rather than a Scheme string.
  • scm_misc_error() is the easiest way to generate errors for which it's not worth assigning a specific error key, and is the C equivalent to the error function in Scheme.

A full list of functions for raising errors is available in libguile/error.h, and since they are much better than scm_error_scm(), use them.

Don't forget that raising a Scheme error causes a non-local exit, and make provisions (such as using dynamic wind) to release any resources you hold correctly.

Conclusion

Guile provides a full-featured error-reporting system, which gEDA applications should use in their Scheme APIs more than they currently do. The best way to raise a Scheme error isn't via scm_error_scm(), but through the undocumented helper functions that libguile provides (and I should probably submit a documentation patch to upstream Guile to fix that...)

In my next post, I'll talk about the guile-snarf tool, and how to use Guile's "snarfing macros" to simplify the boilerplate needed to export C functions as callable from Scheme.

Sunday, June 12, 2011

gEDA and Guile — checking arguments to Scheme functions in C

This is the fifth in a series of blog posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post I'll explain how to use the SCM_ASSERT() macro to check arguments passed to Guile functions written in C using libguile.

Using SCM_ASSERT()

SCM_ASSERT() is a shorthand way to raise a Scheme wrong-type-arg error for a Scheme variable if a test fails. For example, if there is a function which takes a string as an argument, you might write it as:

SCM
myfunc (SCM str)
{
  SCM_ASSERT (scm_is_string (str), str, SCM_ARG1, "my-func");

  /* ... do things with str ... */
}

All functions in gEDA which are exposed as Scheme procedures must use SCM_ASSERT() to thoroughly check the types of their arguments.

There are a few things to note:

  • The first parameter to SCM_ASSERT() is a test, and should evaluate to an integer, not a Scheme value. If the test is zero, an error is signalled.
  • If you need to use a test which evaluates to a Scheme value (e.g. scm_list_p()), don't compare it to SCM_BOOL_F or SCM_BOOL_T directly; use scm_is_true() or scm_is_false().
  • The second parameter is the variable which has the wrong type. Although the test might not be on it directly (e.g. if you are checking for a list of strings), the second parameter must always be the actual argument which was passed to myfunc.
  • The third parameter to SCM_ASSERT() indicates which argument to myfunc had the wrong type — and remember that this is 1-indexed. Although this parameter is just an integer, you should always use the SCM_ARG1, SCM_ARG2 etc. macros for arguments up to 7, since this makes the purpose of the parameter more readable.
  • If you have a function that takes an arbitrary number of parameters, you should use the SCM_ARGn macro as the argument index.
  • The final parameter is a char * string containing the name of the function as seen by Scheme ("my-func" in this case).

If you have a C helper function which is used by many similar Scheme functions, you can generate more useful error messages if you pass it the name of the function that's actually visible to Scheme code. For example:

void
helperfunc (SCM str, const char *funcname)
{
  SCM_ASSERT (scm_is_string (str), str, SCM_ARGn, funcname);

  /* ... do things ... */
}

void
myfunc (SCM str)
{
  helperfunc (str, "my-func");

  /* ... do more things ... */
}

Don't leak resources!

The most important thing to remember is that SCM_ASSERT() does not behave like most normal assertion mechanisms when programming in C (g_assert() in GLib for example), in that it doesn't end the program if it fails — it simply causes a non-local exit by raising an error.

Because SCM_ASSERT() might not return, it's really important to be careful about managing resources when writing a Scheme function in C. For example, consider a function that needs to convert a Scheme list of strings into a GList of C strings. Here's a naive, broken implementation:

/* Don't do this -- it leaks memory */
void
process_string_list (SCM lst)
{
  GList *glst = NULL;
  SCM_ASSERT (scm_is_true (scm_list_p (lst)), lst, SCM_ARG1,
              "process-string-list");

  for (SCM iter = lst; iter != SCM_EOL; iter = SCM_CDR (iter)) {
    SCM str = SCM_CAR (iter);
    SCM_ASSERT (scm_is_string (str), lst, SCM_ARG1, "process-string-list");
    glst = g_list_prepend (glst, scm_to_utf8_string (str));
  }

  /* ... do something with glst ... */
}

If the SCM_ASSERT() inside the loop raises an error, the stack will unwind out of process_string_list() and the memory allocated in glst will be leaked. Instead, you could check the input, and then build the list:

void
process_string_list (SCM lst)
{
  GList *glst = NULL;
  SCM_ASSERT (scm_is_true (scm_list_p (lst)), lst, SCM_ARG1,
              "process-string-list");

  /* Check the input is actually a list of strings */
  for (SCM iter = lst; iter != SCM_EOL; iter = SCM_CDR (iter)) {
    SCM_ASSERT (scm_is_string (SCM_CAR (iter)), lst, SCM_ARG1, "process-string-list");
  }

  /* Convert to a GList */
  for (SCM iter = lst; iter != SCM_EOL; iter = SCM_CDR (iter)) {
    glst = g_list_prepend (glst, scm_to_utf8_string (SCM_CAR (iter)));
  }

  /* ... do something with glst ... */
}

The problem with this is that it is inefficient, as it requires iterating over the linked list twice. A more efficient approach would be to use dynamic wind, as I discussed in a previous post in this series.

/* A function for freeing a GList containing strings */
static void
free_string_glist (void *data)
{
  GList *glst = *((GList **) data);
  for (GList *iter = glst; iter != NULL; iter = g_list_next (iter)) {
    free (iter->data);
  }
  g_list_free (glst);
}

void
process_string_list (SCM lst)
{
  GList *glst = NULL;
  SCM_ASSERT (scm_is_true (scm_list_p (lst)), lst, SCM_ARG1,
              "process-string-list");

  /* Begin a dynamic extent, and register a handler to clean up on error */
  scm_dynwind_begin (0);
  scm_dynwind_unwind_handler (free_string_glist, (void *) &glst, 0);

  /* Convert to a GList, checking types */
  for (SCM iter = lst; iter != SCM_EOL; iter = SCM_CDR (iter)) {
    SCM str = SCM_CAR (iter);
    SCM_ASSERT (scm_is_string (str), lst, SCM_ARG1, "process-string-list");
    glst = g_list_prepend (glst, scm_to_utf8_string (str));
  }

  /* End dynamic extent */
  scm_dynwind_end ();

  /* ... do something with glst ... */
}

Update: Thanks to Ivan Stankovic for pointing out a bug in this example!

Unfortunately, there appear to be quite a few places in gEDA where there is the potential for resource leakage like this to occur, and checking that all functions which use SCM_ASSERT() do so in a safe manner would be a really useful thing to do.
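If the unwind-handler idea feels a bit magical, the following plain-C sketch emulates it with setjmp/longjmp. It is only an analogy and is not how libguile implements dynamic wind; all of the names (push_unwind_handler, pop_unwind_handler, raise_error, run_work and so on) are hypothetical. Handlers registered before an error are run newest-first when the error is raised, and are simply forgotten on a normal exit.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

#define MAX_HANDLERS 8

struct unwind_handler {
  void (*proc) (void *);
  void *data;
};

static struct unwind_handler handlers[MAX_HANDLERS];
static int n_handlers = 0;
static jmp_buf catch_point;
static int frees = 0; /* How many times the cleanup handler has run */

/* Register proc(data) to be run if an error unwinds the stack. */
static void
push_unwind_handler (void (*proc) (void *), void *data)
{
  assert (n_handlers < MAX_HANDLERS);
  handlers[n_handlers].proc = proc;
  handlers[n_handlers].data = data;
  n_handlers++;
}

/* Normal exit: forget the most recent handler without running it. */
static void
pop_unwind_handler (void)
{
  assert (n_handlers > 0);
  n_handlers--;
}

/* Error exit: run handlers newest-first, then jump to the catch point. */
static void
raise_error (void)
{
  while (n_handlers > 0) {
    n_handlers--;
    handlers[n_handlers].proc (handlers[n_handlers].data);
  }
  longjmp (catch_point, 1);
}

/* Cleanup handler: data points at the variable holding the buffer. */
static void
free_buffer (void *data)
{
  free (*(void **) data);
  frees++;
}

static void
work (int fail)
{
  void *buf = malloc (16);
  push_unwind_handler (free_buffer, &buf);
  if (fail)
    raise_error (); /* Does not return; free_buffer() releases buf */
  pop_unwind_handler ();
  free (buf);
}

/* Run work() under a catch point; returns how often the handler ran. */
int
run_work (int fail)
{
  frees = 0;
  n_handlers = 0;
  if (setjmp (catch_point) == 0)
    work (fail);
  return frees;
}
```

Note that the handler receives the address of buf rather than its value, for the same reason that the libguile example passes &glst: the handler needs to see whatever the variable holds at the moment the error occurs.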

Conclusion

As you can tell, even doing something as conceptually simple as checking your arguments safely can be a tricky thing to get right when using libguile — another argument, if you needed one, for keeping the number of functions in gEDA's Scheme API as small as possible while still exposing all of the necessary functionality.

In my next post, I'll discuss how (and when!) in general to throw Scheme errors from C.

Saturday, June 11, 2011

gEDA and Guile — dealing with deprecated libguile functions

This is the fourth in a series of blog posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post I'll explain how to set up your environment to get warnings about use of deprecated libguile API, and some of the most problematic deprecated API usage in gEDA.

Getting notified about deprecated API usage

Most gEDA applications use something like the following to disable deprecation warnings unless you specifically set your environment to request them:

if (getenv ("GUILE_WARN_DEPRECATED") == NULL)
  putenv ("GUILE_WARN_DEPRECATED=no");

The simplest way to enable deprecation warnings is, therefore, to add something like this to your shell rc file:

GUILE_WARN_DEPRECATED="detailed"
export GUILE_WARN_DEPRECATED

Now when you start gschem, you'll get a few warnings on stderr (depending on which version of Guile you've compiled against). Here's an example of what you see when starting gschem if you compile against Guile 2.0:

$ gschem
`(debug-enable 'debug)' is obsolete and has no effect. (1)
Remove it from your code.
SCM_STRING_CHARS is deprecated. See the manual for alternatives. (2)
SCM_SYMBOL_CHARS is deprecated. Use scm_symbol_to_string. (3)

Some of this is a problem, some of it isn't.

  1. We use (debug-enable 'debug) in system-gafrc to enable backtraces in Guile 1.8, but it does nothing in Guile 2.0. It would be nice to find a way to only call it if necessary, but it's not really a priority.
  2. SCM_STRING_CHARS() is used a lot in gEDA to obtain a pointer to the underlying character buffer of a Scheme string, and it's been deprecated since Guile 1.8.0. gEDA's reliance on this is a bit of a problem, unfortunately, and I'll explain why later.
  3. SCM_SYMBOL_CHARS() is similar to SCM_STRING_CHARS() (but for the string representation of a Scheme symbol), and has also been deprecated for a long time.

The problem with SCM_STRING_CHARS() and SCM_SYMBOL_CHARS()

So, why is using SCM_STRING_CHARS() a problem? The main issue is that its use assumes that Guile's internal string representation is the same as gEDA's, i.e. an array of char, and that Guile's internal string encoding is the same as gEDA's, i.e. UTF-8. Neither of these assumptions is reliable.

There are two different internal representations for strings in Guile 2.0. All strings are stored as an array of Unicode code points.

  • If all the code points are in the range 0-255 inclusive, the code points are stored with one byte per code point, i.e. as Latin-1 or ISO-8859-1. This is not UTF-8.
  • If any of the code points is outside that range, the whole string is stored with four bytes per code point, i.e. as UTF-32. This is also not UTF-8.
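
To see why a one-byte-per-code-point buffer can't simply be handed to code that expects UTF-8, here is a small plain-C sketch of the conversion that would be needed. It doesn't use libguile at all, and latin1_to_utf8() is a hypothetical helper written just for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Convert len bytes of Latin-1 into a newly allocated, null-terminated
 * UTF-8 string.  The caller releases the result with free(). */
char *
latin1_to_utf8 (const unsigned char *src, size_t len)
{
  /* Worst case: every byte becomes two bytes, plus the terminator. */
  char *out = malloc (2 * len + 1);
  size_t j = 0;
  assert (out != NULL);

  for (size_t i = 0; i < len; i++) {
    unsigned char c = src[i];
    if (c < 0x80) {
      /* ASCII is identical in both encodings */
      out[j++] = (char) c;
    } else {
      /* Code points 128-255 need two bytes in UTF-8 */
      out[j++] = (char) (0xC0 | (c >> 6));
      out[j++] = (char) (0x80 | (c & 0x3F));
    }
  }
  out[j] = '\0';
  return out;
}
```

For example, "café" is four bytes in Latin-1 (the last being 0xE9), but five bytes in UTF-8 (ending 0xC3 0xA9), so reading the raw buffer as UTF-8 only works by accident for pure ASCII strings.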

Additionally, Guile 2.0 introduces read-only strings (which don't work with SCM_STRING_CHARS()) and shared substrings (which don't work with SCM_STRING_CHARS()).

So how can we break a function func that takes a single string argument and uses SCM_STRING_CHARS()? Let me count the ways.

  1. We can pass it a string containing code points above 255. Since we target a worldwide user base these days, that's not particularly unlikely.

    (func "你好")
  2. We can pass it a shared substring.

    (func (substring/shared "foo bar" 0 3))
  3. We can pass it a read-only substring.

    (func (substring/read-only "foo bar" 0 3))

SCM_SYMBOL_CHARS() shares all these problems. Both these macros have been deprecated since Guile 1.8.0 exactly because the Guile developers wanted to be able to change Guile's original internal string representation to support Unicode fully. (In case you're wondering why they don't use UTF-8, it's because Scheme requires a bunch of string operators that operate on the nth character in a string, and using UTF-8 would make those operators much slower).
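
The point about indexing can be sketched in plain C. With a fixed-width representation the nth character is just an array lookup, but with UTF-8 you have to scan from the start, as below. This is only an illustration under the assumption that the input is valid UTF-8; utf8_offset_of_nth() is a hypothetical helper, not a libguile function.

```c
#include <assert.h>
#include <stddef.h>

/* Return the byte offset of the n-th (0-indexed) code point in a valid,
 * null-terminated UTF-8 string, or -1 if there are fewer code points.
 * The cost grows with n, because every earlier byte must be inspected. */
long
utf8_offset_of_nth (const char *s, size_t n)
{
  size_t seen = 0;
  assert (s != NULL);
  for (size_t i = 0; s[i] != '\0'; i++) {
    /* Continuation bytes look like 10xxxxxx; any other byte starts a
     * new code point. */
    if (((unsigned char) s[i] & 0xC0) != 0x80) {
      if (seen == n)
        return (long) i;
      seen++;
    }
  }
  return -1;
}
```

For the string "你好!" each of the first two characters takes three bytes, so the second character starts at byte offset 3 and the "!" at offset 6.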

Replacing SCM_STRING_CHARS()

What's the alternative? Ideally, we'd use the rather handy scm_to_utf8_stringn() function, but that was only introduced in Guile 2.0 (along with Unicode support), so it's not an option. Instead, we have to rely on scm_to_locale_string(). The main difference between the new functions and SCM_STRING_CHARS() is that the new functions allocate memory, which must be freed with free() (n.b. not g_free()).

Update: gEDA now provides scm_to_utf8_string() and scm_from_utf8_string() even if Guile doesn't, so always use them unless you actually want to work with locale-encoded strings.

So suppose we started off with a version of myfunc() that uses SCM_STRING_CHARS():

void
myfunc (SCM arg)
{
  /* N.b. we should check that arg is in fact a string */
  printf ("%s", SCM_STRING_CHARS (arg));
}

It should be replaced by:

void
myfunc (SCM arg)
{
  char *arg_str;
  /* N.b. we should check that arg is in fact a string */
  arg_str = scm_to_utf8_string (arg);
  printf ("%s", arg_str);
  free (arg_str);
}

In reality, you'll want to do something more complicated than just print the string. Don't forget that if you do more than trivial calls into libguile in between creating arg_str and freeing it you should probably use dynamic wind to make sure that it is properly cleaned up.

You should also be aware that because scm_to_utf8_string() tries to return a null-terminated string, it throws an error if the string contains #\nul characters. It can also throw an error if the string can't be converted to the requested encoding (which is locale-dependent for scm_to_locale_string()). This introduces its own challenges.

Replacing SCM_SYMBOL_CHARS()

Replacing SCM_SYMBOL_CHARS() is similar to replacing SCM_STRING_CHARS(); simply use scm_symbol_to_string() to convert the symbol to a string, and then use scm_to_utf8_string() as before.

Conclusion

Library APIs are rarely deprecated without a good reason, and being aware of and proactive about updating deprecated API usage can help avoid some serious problems. Updating gEDA to remove the use of the SCM_STRING_CHARS() and SCM_SYMBOL_CHARS() macros is an important job. Nevertheless, it would still be quite accessible for someone less familiar with the gEDA code base, as it can be done by dealing with one function at a time.

In my next post, I will describe how to use SCM_ASSERT to check types of SCM arguments to functions that use libguile.

Friday, June 10, 2011

gEDA and Guile — safe handling of non-local exits

This is the third in a series of blog posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post I'll be talking about an issue which the current gEDA codebase doesn't deal with very well: dealing with non-local exits in C code that uses libguile.

What are non-local exits?

Non-local exits are a way for execution to jump out of the current execution context to another point in the program. The only way of doing this in C is through the quite limited setjmp/longjmp mechanism. Guile supports two fully-featured constructs for non-local control flow: exceptions and continuations.

Exceptions will be fairly familiar to many developers, as they are heavily used in languages such as Java and Python. When an exception is thrown, the stack is unwound until an exception handler is found for the exception. If none is found, then the program will usually exit with an uncaught exception error. Guile uses exceptions to implement its error handling mechanism, and you can read about that in more detail in the Guile manual (Exceptions, Error Reporting).

The other way that a non-local exit might occur is via a continuation. These are pretty hard to get your head around, unfortunately. The Guile manual describes them as follows:

A “continuation” is the code that will execute when a given function or expression returns. For example, consider

(define (foo)
  (display "hello\n")
  (display (bar)) (newline)
  (exit))

The continuation from the call to bar comprises a display of the value returned, a newline and an exit. This can be expressed as a function of one argument.

(lambda (r)
  (display r) (newline)
  (exit))

In Scheme, continuations are represented as special procedures just like this. The special property is that when a continuation is called it abandons the current program location and jumps directly to that represented by the continuation.

A continuation is like a dynamic label, capturing at run-time a point in program execution, including all the nested calls that have lead to it (or rather the code that will execute when those calls return).

Guile implements continuations that are usable from C code with some pretty crazy techniques for saving and restoring the C stack.

The important point to take away from this is that when you call a Scheme function from C, you cannot depend on it returning normally. That means that totally legitimate resource management in a normal C/GLib program, such as:

void myfunc()
{
  gchar *buf = g_strdup ("const string");
  anotherfunc (buf);
  g_free (buf);
}

becomes unsafe if anotherfunc() calls Scheme code that might either call a continuation or throw an exception, since in that case the call to g_free() would never occur and the memory assigned to buf will be leaked.

In gEDA there are several places which do exactly this, and they need fixing.
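The leak can be reproduced without Guile at all. The sketch below is only a stand-in: longjmp() plays the role of a Scheme exception or continuation call, plain malloc()/free() replace g_strdup()/g_free(), and the helper names (tracked_strdup, tracked_free, count_leaks) and the extra throw_error parameter are assumptions added for the demonstration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf escape_point;
static int live_buffers = 0;   /* Allocated but not yet freed */

static char *tracked_strdup (const char *s)
{
  char *copy = malloc (strlen (s) + 1);
  if (copy == NULL)
    abort ();
  strcpy (copy, s);
  live_buffers++;
  return copy;
}

static void tracked_free (char *p)
{
  assert (live_buffers > 0);
  free (p);
  live_buffers--;
}

/* Stand-in for a function that calls into Scheme code. */
static void anotherfunc (const char *buf, int throw_error)
{
  (void) buf;
  if (throw_error)
    longjmp (escape_point, 1);
}

/* Same shape as the unsafe myfunc() above. */
static void myfunc (int throw_error)
{
  char *buf = tracked_strdup ("const string");
  anotherfunc (buf, throw_error);
  tracked_free (buf);   /* Skipped when anotherfunc() jumps away */
}

/* Returns the number of buffers leaked by one call to myfunc(). */
int count_leaks (int throw_error)
{
  int before = live_buffers;
  if (setjmp (escape_point) == 0)
    myfunc (throw_error);
  return live_buffers - before;
}
```

With this stand-in, count_leaks (0) returns 0 and count_leaks (1) returns 1: the buffer allocated in myfunc() is never freed once the jump skips past the cleanup line.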

Dealing with non-local exits safely

Fortunately, Guile provides two mechanisms for dealing with this, called Continuation Barriers and Dynamic Wind.

Continuation barriers are the less powerful of the two mechanisms, but are simpler to understand. They simply block both exceptions and continuations from entering or leaving the context of the barrier call. Here is how to modify the previous example to use a continuation barrier:

void myfunc()
{
  gchar *buf = g_strdup ("const string");
  /* For this call, anotherfunc() must have the signature
   * void *anotherfunc (void *data) */
  scm_c_with_continuation_barrier (anotherfunc, buf);
  g_free (buf);
}

There's an obvious disadvantage to this approach: you can't catch and handle exceptions which occur, or allow exceptions to propagate upward to the function that called myfunc() in case they can be caught and handled there!

Dynamic wind provides a more flexible method. The scm_dynwind_begin() and scm_dynwind_end() functions delimit a dynamic extent, and it's possible to register actions to be carried out whenever the dynamic extent is entered or left (e.g. due to a continuation or exception). It's also possible to indicate that non-local exits from the dynamic extent are permitted, but non-local entries aren't (this is the usual thing to do when using dynamic wind from C code).

Here is myfunc() modified to use dynamic wind:

void myfunc()
{
  /* Begin a dynamic extent that can't be re-entered */
  scm_dynwind_begin (0);

  gchar *buf = g_strdup ("const string");

  /* Make sure that buf is freed when the dynamic extent is
   * left either locally or non-locally */
  scm_dynwind_unwind_handler (g_free, buf, SCM_F_WIND_EXPLICITLY);

  anotherfunc (buf);

  /* End dynamic extent */
  scm_dynwind_end ();
}

This ensures that buf is always freed. If anotherfunc() raises an exception, or it calls a continuation that jumps out of myfunc(), the unwind handler will make sure that the string is freed. Since the SCM_F_WIND_EXPLICITLY flag was passed to scm_dynwind_unwind_handler(), the handler will also be called if the call to scm_dynwind_end() is reached and the function returns normally.
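If it helps to see the idea behind unwind handlers in isolation, here is a toy model in plain C. It is only a sketch of the concept and not how libguile implements dynamic wind: longjmp() stands in for a Scheme exception, and push_unwind_handler, unwind_to, counting_free and run_protected are hypothetical names invented for this example.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HANDLERS 16

typedef void (*unwind_fn) (void *);

/* A simple stack of pending cleanup actions. */
static struct { unwind_fn fn; void *data; } handlers[MAX_HANDLERS];
static int handler_count = 0;

static jmp_buf escape_point;
static int freed_count = 0;

/* Register a cleanup action for the current extent. */
static void push_unwind_handler (unwind_fn fn, void *data)
{
  assert (handler_count < MAX_HANDLERS);
  handlers[handler_count].fn = fn;
  handlers[handler_count].data = data;
  handler_count++;
}

/* Run and discard all handlers registered since mark, newest first. */
static void unwind_to (int mark)
{
  while (handler_count > mark) {
    handler_count--;
    handlers[handler_count].fn (handlers[handler_count].data);
  }
}

static void counting_free (void *p)
{
  free (p);
  freed_count++;
}

/* Stand-in for a function that calls into Scheme code. */
static void anotherfunc (const char *buf, int throw_error)
{
  (void) buf;
  if (throw_error)
    longjmp (escape_point, 1);
}

static void myfunc (int throw_error)
{
  int mark = handler_count;                  /* cf. scm_dynwind_begin() */
  char *buf = malloc (sizeof "const string");
  if (buf == NULL)
    abort ();
  strcpy (buf, "const string");
  push_unwind_handler (counting_free, buf);  /* cf. scm_dynwind_unwind_handler() */
  anotherfunc (buf, throw_error);
  unwind_to (mark);                          /* cf. scm_dynwind_end() */
}

/* Returns how many buffers were freed by one call to myfunc(). */
int run_protected (int throw_error)
{
  int mark = handler_count;
  int before = freed_count;
  if (setjmp (escape_point) != 0)
    unwind_to (mark);   /* Non-local exit: run what myfunc() never reached */
  else
    myfunc (throw_error);
  return freed_count - before;
}
```

In this model run_protected (0) and run_protected (1) both return 1: the buffer is freed exactly once whether myfunc() returns normally or is abandoned part-way through. The point of the design is that cleanup is recorded as data at the moment the resource is acquired, so whoever handles the non-local exit can run it even though myfunc() itself never gets control back.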

Conclusion

One task that's well past due is a review of functions in gEDA that call into Scheme to make sure that they safely handle non-local exits, either by using continuation barriers, dynamic wind, or the "protected" functions provided in libgeda (see libgeda/src/g_basic.c). This would be ideal for someone less familiar with the code base to tackle, since it can be approached on a function-by-function basis.

In my next post, I will describe how to set up your environment to get notifications about the use of deprecated libguile APIs in gEDA, and in particular the problems caused by the continued use of the long-deprecated SCM_STRING_CHARS macro.

Thursday, June 09, 2011

gEDA and Guile — compiling against multiple Guile versions

This is the second in a series of blog posts on extensibility in gEDA using Guile Scheme.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this post, I'll explain how to set up your build environment in such a way that you can easily compile gEDA against either Guile 1.8 or Guile 2.0, since we need to support using gEDA with either version of Guile at the moment.

Firstly, you need to build and install Guile 2.0 and Guile 1.8 to their own prefixes, e.g. /opt/guile-2.0/ and /opt/guile-1.8/.

$ wget ftp://ftp.gnu.org/gnu/guile/guile-2.0.1.tar.gz
$ tar -zxf guile-2.0.1.tar.gz
$ cd guile-2.0.1
$ ./configure --prefix=/opt/guile-2.0
$ make
$ sudo make install
$ cd ..

$ wget ftp://ftp.gnu.org/gnu/guile/guile-1.8.8.tar.gz
$ tar -zxf guile-1.8.8.tar.gz
$ cd guile-1.8.8
$ ./configure --prefix=/opt/guile-1.8
$ make
$ sudo make install

Next, you need to have some way of quickly adding and removing the different versions of Guile to your environment. Specifically, you need to be able to modify INFOPATH, PATH, MANPATH, LD_LIBRARY_PATH and PKG_CONFIG_PATH.

My approach is to use Environment Modules (henceforth referred to as modules). The idea is to be able to do something like this:

$ module load guile/1.8
$ which guile
/opt/guile-1.8/bin/guile
$ module switch guile guile/2.0
$ which guile
/opt/guile-2.0/bin/guile

One particularly neat thing about modules is that although I give all my examples here for POSIX sh and/or bash, the same module files can be used with other shells such as tcsh with no modification.

Installing & setting up Environment Modules

Unfortunately, modules can be a bit of a pain to install, and many distributions don't appear to package it. This is a guide to getting it working on Ubuntu 10.04; it's much easier on Fedora, since there you can just install the environment-modules package.

Due to a really obnoxious default file layout that's not at all trivial to fix, it's easiest to install modules to its own prefix, e.g. /opt/Modules. You'll need to do something like this (after downloading the source code):

$ tar -zxf modules-3.2.8a.tar.gz
$ ./configure --disable-versioning --prefix=/opt
$ make
$ sudo make install

It's then necessary to set up your ~/.bash_profile and ~/.bashrc to enable modules for your sessions. Firstly, you need to add some boilerplate to the top of ~/.bash_profile:

if [ -f /opt/Modules/init/bash ]
then
  # Set the file to store your initial environment in
  export MODULESBEGINENV=$HOME/.modules/beginenv
  # Initialise environment for modules
  . /opt/Modules/init/bash
  # Set a directory for your personal module files (create this directory)
  module use $HOME/.modules
  # Load some default modules
  module load guile
fi

It's also useful to add this to your ~/.bashrc, to make sure that when you start a new shell your environment is preserved:

if [ ! -z "$MODULESHOME" ]; then
  module() { eval `$MODULESHOME/bin/modulecmd bash $*`; }

  module update # Need to reload modules because of setuid utmp
fi

Creating modules for Guile versions

The next step is to make the module files for the two versions of Guile. These will be called /opt/Modules/modulefiles/guile/1.8 and /opt/Modules/modulefiles/guile/2.0, and will look like this:

#%Module1.0
prepend-path PATH /opt/guile-2.0/bin
prepend-path LD_LIBRARY_PATH /opt/guile-2.0/lib
prepend-path PKG_CONFIG_PATH /opt/guile-2.0/lib/pkgconfig
prepend-path INFOPATH /opt/guile-2.0/share/info
prepend-path MANPATH /opt/guile-2.0/share/man

with 2.0 replaced by 1.8 for /opt/Modules/modulefiles/guile/1.8.

Finally, it's necessary to tell modules which version of Guile should be the default, for when a user asks for module load guile. We can do this by adding a /opt/Modules/modulefiles/guile/.modulerc file.

#%Module1.0
set ModulesVersion 2.0

Putting it all together

After re-initialising your session (probably easiest to log out and back in again), you should be able to swap between versions of Guile using module load and module switch as described above.

You should now be able to compile gEDA against either version of Guile: simply load the appropriate environment module, and then run:

$ ./config.status --recheck && ./config.status && make

In my next post I will discuss one of the problems that's quite common in gEDA's use of Guile: unsafe behaviour on non-local exits such as Scheme exceptions.

Wednesday, June 08, 2011

gEDA and Guile — finding Scheme API code in gEDA

The response to my previous post was overwhelmingly in favour of continuing and merging my work on gEDA extensibility using Guile Scheme, both here and on the gEDA mailing lists. Some people asked how they could help out, and so I'm going to write a series of blog posts with some information on how gEDA uses Scheme and some easy introductory tasks that would be useful.

  1. Finding Scheme API code in gEDA
  2. Compiling against multiple Guile versions
  3. Safe handling of non-local exits
  4. Dealing with deprecated libguile functions
  5. Checking arguments to Scheme functions in C
  6. How and when to use Scheme errors
  7. Reducing boilerplate with "snarfing macros"
  8. Opportunities to get involved

In this blog post, I'll explain where Guile is used in gEDA at the moment, both C code using libguile and Scheme code itself.

What is Scheme used for?

Currently, Guile is used for:

  • Executing "configuration" files (actually initialisation files) written in Scheme, in all libgeda applications.
  • Setting up the menus and keybindings in gschem. Key sequences and menu items call Scheme "thunks" (functions of no arguments), which are mostly implemented in C. It's not currently feasible to create a meaningful new action in gschem without writing it mostly in C.
  • Exporting netlists in gnetlist. A gnetlist backend generates a particular on-disk netlist file format using a limited API that provides access to the compiled netlist.
  • A few other things (mostly incomplete). For example, there is a very limited existing Scheme API spread between gschem and libgeda for taking actions on changes to pages and/or attributes.

It would be nice to do more things (like extending gschem with additional actions written in pure Scheme, or being able to modify the way that gnetlist compiles a netlist from input schematics).

Where is the Guile code in gEDA?

In the C sources, files which contain code using libguile usually begin with the prefix "g_". For example, "g_rc.c" contains definitions of functions used for configuration parameters, and "g_register.c" contains code for registering C functions and variables to be visible from Scheme. There are a bunch of exceptions to be aware of, which include:

  • libgeda/src/s_clib.c contains some libguile code that permits component libraries to be defined as a set of Scheme functions.
  • libgeda/src/s_menu.c contains some basic infrastructure for menu definitions in Scheme rc files that's past due to move into gschem.
  • libgeda/src/s_color.c has functions for converting gEDA colour map data to and from Scheme representations.
  • Several gschem source files contain snippets of libguile code for firing hooks when the user carries out certain actions.
  • gschem/src/x_menus.c implements menus defined in Scheme, and gschem/src/x_stroke.c provides support for assigning gestures to actions in Scheme (a capability I'm not sure I've ever used).

More generally, you can usually find code that uses libguile by searching for the "scm_" or "SCM_" prefix used by libguile functions, macros and variables.

Most of the application directories in the gEDA source tree have a "scheme" subdirectory where most of the ".scm" files with Scheme code live. There are also the system rc files for each application, which are also written in Scheme but for some reason live in "lib" subdirectories.

So that's a brief overview of where the Scheme-related code lives in gEDA. In my next post, I will talk about how to set up your build environment to be able to easily build against either Guile 1.8 or Guile 2.0.

Sunday, June 05, 2011

Porting gEDA to Guile 2.0, and future plans

Guile, a Scheme implementation that's part of the GNU project, recently put out a new 2.0 release with many exciting new features and improvements.

gEDA relies on Guile for extensibility. In the past we have had a few problems moving to new versions of Guile, often due to differences between memory management APIs or library functions.

I first started looking at Guile 2.0 support in gEDA in December 2010. At the time, I had great difficulty getting libgeda to work due to differences in the debugging APIs between Guile 1.8 and 2.0. However, during January I did a considerable amount of work on how Scheme errors are reported in libgeda, and as part of that I managed to do things in a way compatible with both versions of Guile.

Last Friday I was therefore able to get gEDA fully working with Guile 2.0. Actually, the main remaining obstacle was that the new version of Guile is much stricter about correct Scheme syntax, and some of the gnetlist backends needed fixing up.

But that's done, and gEDA is now fully compatible with Guile 2.0 and can take advantage of the performance improvements it brings!

What's next for me in terms of gEDA development? Well, I still have several different development branches on the go, and I'm not sure which of them (if any) to focus on.

  1. I have a branch which extensively refactors, extends and improves the libgeda scheme API, with the intention of making it possible to write much more fully-featured plugins & extensions for gschem and gnetlist.
  2. I have done a lot of work on splitting the cairo rendering code out of gschem into a separate library called libgedacairo, in order to make it easier to write applications that display gEDA schematics or convert them to other graphics formats. (As part of this work, I'm also modifying gschem to take advantage of the GTK+ printing APIs).
  3. I have a design for a new configuration system for gEDA, which is non-executable and can provide per-schematic-page configuration. This is intended to remove a security flaw in gEDA at the moment when opening designs received from others.

Or are there other, more urgent things that need work? Let me know what you think.