Peter Brett's Blog

Tuesday, December 25, 2007

Christmas

I've been horribly ill for what feels like most of December, but Christmas is now here! I'm spending Christmas with my parents and all of my siblings, and it seems like I've been running around like a mad thing for the last few days getting everything clean, tidy and ready for Christmas. On Christmas Eve my mother and I were at the butcher's shop at half past six in the morning to get the turkey and other bits we needed! This was most certainly a better option than waiting in the hour-long queue that had formed by the time I returned to the town centre later in the morning to do some other shopping.

Although being miserably ill put a big damper on my plans to do vast amounts of gEDA development during December, I've nevertheless managed to get some useful stuff done, building on the libgeda error reporting issues discussed in my last couple of posts.

Firstly, I improved the way in which errors in Scheme files were reported. Messages reporting errors in gEDA rc files often reported the error as occurring in the wrong file, and claimed every error was a parenthesis mismatch (often when they were nothing of the sort). The error messages were also sent directly to the console rather than to the gEDA logging mechanism. Thanks to some deep dark Guile debugging magic, any error encountered while reading in rc files is now logged with a backtrace, and libgeda tries to report the actual file and line on which the error occurred.

Secondly, and more importantly, I added internationalisation support to libgeda using gettext. As there are many international users of gEDA, and many important messages generated by libgeda, I feel that this is a very important development.

I'm hoping that we can use the web translation tools offered by Launchpad to make translating gEDA easier for users, and thus increase the quality and number of translations.

Peter Clifton and I have also been working hard on getting gEDA ready for a new release, squishing as many serious bugs as possible. Peter Clifton has particularly excelled himself, creating stunning icons for all the main file types used by gEDA.

In the next development cycle, I hope to substantially increase the flexibility of the component library. In particular, I hope to make it possible to have truly independent library setups, so that it is possible to have schematics from different projects with different libraries open in the same gschem session without any conflicts. I also hope to generalise the library infrastructure so that the same basic code can be used for the component library, hierarchy source library and any other library of reusable resources which might need to be added in the future.

As a first step along the route to the Ultimate Resource Library, however, the way that the Scheme extension API works needs to be rationalised, and once Christmas starts to wind down, I hope to deal with this. It will involve fluids.

Update: My blood test results have arrived, and it turns out that I have glandular fever. Absolutely fantastic.

Friday, December 07, 2007

Error-reporting progress

Further to my previous blog post on error-reporting in libgeda, I've made some pretty good progress on implementing the ideas that I described. In particular:

The file opening functions (f_open() and f_open_flags()) now use GError to return error messages rather than just spewing them to the console.
gschem now shows an informative dialog box when it fails to load a file.
All programs now actually tell you the real reason for failing to load a file (e.g. "Permission denied" or "File does not exist") rather than just giving a generic "Could not open file" message.
The libgeda logging mechanism now handles most levels of GLib log message rather than just the "message" level (seeing as the log is most useful when some sort of error has occurred, this seems sensible).
gschem now highlights warnings and errors in the log window.
Lots of ways in which libgeda could randomly kill your program have been eradicated.

These changes are not yet in the unstable branch, but barring someone objecting to my changes I'll push them upstream tomorrow sometime.

The next (probably thankless) task is to work through libgeda converting all calls to directly output debugging/error-reporting information to the console to use the GLib logging functions instead, and to make log messages be sent at appropriate levels rather than all at the "message" level (this will allow filtering of log messages in the gschem log window, for instance).

I'm still of two minds about the best way to do this. Currently, libgeda defines a macro s_log_message() which is equivalent to GLib's g_message(). For adding logging at multiple levels, the two options are:

Get rid of s_log_message entirely, and just use the GLib logging macros directly.
Define s_log_debug, s_log_warning etc so that the exact implementation could be easily modified at a later date (e.g. for the purpose of making libgeda always log in the "libgeda" log domain).

I'm currently leaning towards the latter, but haven't as yet made up my mind about exactly which will be the best way to go.
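To make the second option a little more concrete, here is a minimal sketch of the idea. It is written in Python with the standard logging module rather than in C with GLib, so it is only an analogy: the s_log_* names mirror the ones proposed above, the "libgeda" domain is the one mentioned above, and everything else (the ListHandler class, the example messages) is made up for illustration.

```python
import logging

# Hypothetical Python stand-ins for the proposed s_log_* wrappers.
# Every wrapper logs through one named logger, so the log domain
# can be changed in a single place later on.
LOG_DOMAIN = "libgeda"
_log = logging.getLogger(LOG_DOMAIN)


def s_log_debug(fmt, *args):
    _log.debug(fmt, *args)


def s_log_message(fmt, *args):
    _log.info(fmt, *args)


def s_log_warning(fmt, *args):
    _log.warning(fmt, *args)


def s_log_critical(fmt, *args):
    _log.critical(fmt, *args)


class ListHandler(logging.Handler):
    """Collects (domain, level, text) tuples, much as a log window might."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(
            (record.name, record.levelname, record.getMessage()))


handler = ListHandler()
_log.addHandler(handler)
_log.setLevel(logging.DEBUG)
_log.propagate = False  # keep the example's output out of the root logger

s_log_message("Loaded %s", "example.sch")
s_log_warning("Unknown attribute %s", "foo")
```

Because callers only ever see the wrappers, changing the domain or the underlying logging call later would not touch any call site, which is the main attraction of this option.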

Tuesday, November 27, 2007

Error reporting & handling in libgeda

Apparently some people do read my blog: Peter Clifton and I met up for a drink at Borders in town this evening, and he recommended that I write more often. So here's this evening's contribution, on a slight refactor of libgeda error reporting.

It's been a while since I last blogged about gEDA development, and since then I've had the good fortune to be able to travel to Cambridge, MA and meet several of the other developers based around there. In the last couple of days I've started ramping up my involvement in preparation for having some time for serious development over the Christmas vacation.

So, to begin: in general, libgeda kills the process without warning far too often. This is a bad thing for a library to do; in particular, it's very bad to do it just because someone passed some bad data in. libgeda doesn't make any attempt to hide how the data structures are organised, and thus we have to assume that user code will make use of this knowledge to try and hack them directly.

My pondering of this problem over the last couple of days has led me to think up four basic rules to consider when working out how to handle errors occurring in libgeda code:

If possible, succeed.
If failure is inevitable, fail gracefully.
If normal operation may result in failure, use GError.
Assume that libgeda works.

Of course, some of these need a little explanation as to their interpretation and why they make sense.

Firstly, "If possible, succeed." This seems obvious, but actually has some subtlety. What I mean here is that if there's a sensible, clean way of carrying on despite a problem, you should do so. This only really applies to user-facing code, as code which can't be called from outside libgeda really should have had its inputs checked by the calling function already. However, since all of libgeda's code is user-facing -- we don't have any private headers or the like -- this point is rather moot. In addition, "succeeding" doesn't preclude printing messages to the effect of, "Someone's playing silly buggers but I'm going to try my best anyway." g_critical() should be used for this. One example would be when an unknown object type is encountered; at the moment, libgeda often kills the program by calling g_assert_not_reached(), when often it would be valid to continue by logging a critical message and then skipping over the offending object.
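As a rough illustration of "log a critical message and skip the offending object", here is a small sketch. It is Python rather than libgeda's C, the read_objects() function and the type-code table are hypothetical, and logging.critical() merely stands in for g_critical().

```python
import logging

log = logging.getLogger("libgeda")

# Hypothetical type codes, for illustration only.
KNOWN_TYPES = {"L": "line", "B": "box", "T": "text"}


def read_objects(lines):
    """Parse lines of the form '<type> <data>', skipping unknown types."""
    objects = []
    for lineno, line in enumerate(lines, start=1):
        type_code, _, data = line.partition(" ")
        kind = KNOWN_TYPES.get(type_code)
        if kind is None:
            # Complain loudly, but carry on with the rest of the input
            # instead of killing the whole program.
            log.critical("line %d: unknown object type %r, skipping",
                         lineno, type_code)
            continue
        objects.append((kind, data))
    return objects
```

The caller still gets every object that could be understood, and the log records exactly what was dropped and where.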

Secondly, "If failure is inevitable, fail gracefully." Failing gracefully requires no dangerous side-effects from the failure. If possible, the system should be returned to its prior state. This often requires that user data needs to be checked before doing anything destructive, possibly at the expense of some CPU time.

Thirdly, "If normal operation may result in failure, use GError." GError is a nice mechanism in that it allows errors to be ignored if necessary. A good example of when it should be used would be in code which reads and writes files; because GLib's file access code already uses GError extensively, this would not be hard to implement.

Finally, "Assume that libgeda works." Similarly to rule #1, what is meant here is that libgeda functions should check their own behaviour. If they do so, there is no need for libgeda functions which call them to check again -- it's a waste of CPU time and developer effort.

So, given the above points, when is it appropriate to use g_assert()? It would not be appropriate to use it to check the arguments passed to a function, but it would be valid to use it to check that the function has successfully done what it is supposed to before returning a result. For instance, when a complex algorithm is in use, putting some assertions in to make sure that the algorithm actually does what you think it does might be a very good idea.
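The distinction might look something like the sketch below. Again this is Python rather than C, and bounding_box() is a made-up function, not part of libgeda: bad arguments are reported and handled gracefully, while the assertion is reserved for checking the function's own result.

```python
import logging

log = logging.getLogger("libgeda")


def bounding_box(points):
    """Return (left, bottom, right, top) enclosing all (x, y) points.

    Returns None if there are no points to enclose.
    """
    # Bad arguments are the caller's mistake: report them and fail
    # gracefully rather than asserting.
    if not points:
        log.critical("bounding_box: called with no points")
        return None

    left = right = points[0][0]
    bottom = top = points[0][1]
    for x, y in points[1:]:
        left, right = min(left, x), max(right, x)
        bottom, top = min(bottom, y), max(top, y)

    # Postcondition: this checks our own algorithm, not the caller's
    # data, so an assertion is appropriate here.
    assert all(left <= x <= right and bottom <= y <= top
               for x, y in points)
    return (left, bottom, right, top)
```

If the assertion ever fires, the bug is in bounding_box() itself, which is exactly the situation where stopping the program is defensible.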

Of course, this has led to a number of action items in terms of libgeda refactoring:

Cleaning up uses of g_assert() in libgeda, and replacing them with g_critical() logging and graceful failure where possible. I've already made a start on this.
Moving all of the file handling code to use GError. This will allow gschem to show a message dialog when a file operation fails, rather than forcing the user to look at the log window or at the console.
In the medium term, moving all of libgeda to use the GLib logging API (of which g_critical() is part) will allow the development of a shiny new gschem log window which allows logging at different levels with colour coding of criticality. This would be nice.
Ideally, the error messages used for GErrors emitted from libgeda should be translated. This means libgeda will need to use gettext.

As usual, I'd be interested to hear people's thoughts on this. Please let me know on the -dev list or by e-mail.

Monday, July 23, 2007

Taking time out

Much of my work on gEDA recently has met with rather mixed responses, many not particularly positive. Most of the things that bug me are fixed, so between the lukewarm reception that seems to meet my changes and the absence of any particular itch to scratch, I'm taking a little time out from gEDA hacking. This will also (hopefully) help with my RSI, which is slightly more aggravating than usual at the moment. No doubt I'll still troll the mailing lists telling users to RTFM (or possibly write it).

My current contract is both interesting and hard; I'm doing an electronic systems redesign for a laboratory automation robot to be used by the Sanger Institute, at the Wellcome Trust site near Cambridge. This is rather involved, with a very broad scope for the work. I've had to learn a lot about different types of electric motor (in particular stepper motors and brushless DC motors), as well as needing to keep a very careful eye on issues such as inductive load dump and EMI filtering. Sometimes digging around to find out the design rationale for parts of the current design can get a little tedious; I hope that my attempts to put everything in one place will be useful for future engineers working on the same system!

One thing that has recently struck me while working on this project is the lack of a consistent lightweight symbol library for gEDA suitable for drawing circuit diagrams to be used in documents such as datasheets or academic papers. I've made a rough start on creating one, but have got a little stuck on exactly what symbols are within the scope; and more stuck on finding a sensible way to name all of the different types of diode available.

Saturday, June 30, 2007

Making embedding more useful

One of the trickiest problems electronic engineers have is with keeping their symbol libraries up-to-date while not breaking old schematics (e.g. with changing pin positions).

Currently, most people use one of two approaches to managing the problem:

gEDA has the ability to embed symbol data in the schematic file, making the schematic independent of changing symbol libraries, and some people choose to embed all their symbols. However, a full copy of the symbol is embedded for each time it's instantiated in the schematic, making the file size bloat enormously. In addition, it is possible neither to check whether any of the symbols have updated versions available in the library nor to update all symbols of a given type together, and it is also impossible to make sure that all symbols of the same type are embedded from the same version of that symbol.
gEDA also has the ability to use a project-specific configuration file which lives in the same directory as the schematics. This can be used to implement a project-specific component library where all symbols used by the project live. Although this solves the problem of keeping all symbols in the project in sync with each other, it is quite high maintenance to keep the project library in sync with a "master" library. It is also still impossible to automatically check for newer versions and update them.

My solution to this would be to add the missing features to the embedding functionality (presupposing that these embedding bugs [1692626] are also fixed).

By default, each symbol used would automatically have a single copy embedded in the schematic, which instances of that symbol would reference. This would mean that all instances of a given symbol would be upgraded in lockstep, eliminating inconsistencies.

A hash algorithm would be used to compare the embedded version of a symbol with the latest library version. If they differ, an icon would be shown in a new component manager (derived from the current component selector) and users would be able to update symbols from there, either all symbols at once or one symbol at a time.
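The comparison step could be sketched roughly as follows. This is Python, not gschem code; the choice of SHA-256 is just an assumption for the example (no particular hash algorithm has been settled on), and the function names and the dictionaries of symbol text are hypothetical.

```python
import hashlib


def symbol_digest(symbol_text):
    """Hash of a symbol's file contents (SHA-256 is an arbitrary choice)."""
    return hashlib.sha256(symbol_text.encode("utf-8")).hexdigest()


def outdated_symbols(embedded, library):
    """Names of embedded symbols whose library copy exists and differs.

    Both arguments map a symbol name to the text of that symbol.
    Symbols missing from the library are not reported, since there is
    nothing to update them to.
    """
    stale = []
    for name, text in sorted(embedded.items()):
        lib_text = library.get(name)
        if lib_text is not None and \
                symbol_digest(lib_text) != symbol_digest(text):
            stale.append(name)
    return stale
```

In practice it would probably make sense to store the digest alongside the embedded copy, so that only the library side needs hashing when a schematic is opened.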

I have yet to work out whether it should be possible to edit symbols which do not exist in the local library, but it should certainly be possible to open them for read-only viewing or export them to discrete symbol files.

I believe this solution would eliminate most of the current deficiencies of gEDA and gschem for keeping archived schematics safe and keeping current schematics in sync with libraries.

Terminal emulators, utmp and setgid

Something that's been really irritating me recently on major Linux distributions is the increasing proportion of user software being installed setgid.

This is usually for security reasons; for instance, to either protect system services from accidental user interference, or to protect the user's personal information from being sniffed by other software they may be running.

Unfortunately, when a setgid executable is run, the environment is "cleaned"; potentially dangerous environment variables like TMPDIR and LD_LIBRARY_PATH get unset (this stops library preload attacks, for example).

But what if you want to have LD_LIBRARY_PATH set, for legitimate reasons?

The first set of breakage that bugged me was in xinit, where the call to start the window manager (which occurs after the user's login scripts are run) was wrapped in a call to SSH_AGENT. This effectively meant that it was not possible to set LD_LIBRARY_PATH for your window manager session at all. See Red Hat Bug #164869. Fortunately (a) there's a workaround available (see the bug discussion), and (b) it looks like this is going to be fixed soon(ish).

The second set of breakage was that several terminal emulators (xterm, konsole) are being installed setgid utempter. /var/run/utmp is a file which stores information on who is currently using the system. In order to protect it from malicious (or ignorant) users, it's protected by being only accessible to the utempter group, so in order to be able to update it, terminal emulators need to be installed setgid.

I think this is stupid. It violates the rule of least surprise, and makes it difficult to debug environment variable problems in one's window session. Most of the workarounds are ugly hacks (like setting LD_LIBRARY_PATH in .bashrc, which is not what .bashrc is for).

Fortunately, I'm not the only one who thinks it's stupid, and now a change request has been filed against utempter, although goodness knows when it'll actually get fixed. See Red Hat Bug #246063.

The workaround for these problems I'm using at the moment is environment modules. I described the procedure to set it up in a posting to the gEDA user list.

Friday, June 29, 2007

Hacking in the unstable branch

For most of June, gEDA has been in the process of switching from using CVS to git for version control, and quite a lot of developers' time seems to have been spent on getting used to the new system. I really like the new system; it makes reviewing changes and maintaining patchsets so much less of a chore, and my productivity has gone through the roof!

I've made really good progress on the component library and Scheme work which I said that I had planned to do during June:

The component selector preview widget now works properly for all possible component library backends. My changes coincidentally fix a security hole exploitable by malicious library distributors (they could use the library directory RC file mechanism to execute arbitrary code on your system).
system-gafrc is now installed by libgeda, and loads separate Scheme files for the default libraries and default font. Hopefully, this should make it easier to avoid using the default symbol library at all if, for some reason, you don't want it.
I zapped a bunch of deprecated Guile functions, and changed the configure scripts to check for Guile by version, but it's pretty boring stuff and I don't really have any motivation to finish it off at the moment.
Replacing the directory and command component library backends with equivalent Scheme implementations has had to be suspended, due to the fact that the Guile functions for working with pipes really don't work very well on Windows, and Windows support looks like it's going to become more important over the next few months.
I've replaced s_clib_glob() with a similar function, s_clib_search(), which can operate either in exact matching mode or glob matching mode. This will allow people to use symbols with glob special characters in their names without strange breakage. I've also added a caching mechanism for the results of the search. I went through all the places where pointers to CLibSource or CLibSymbol structures might be cached, and tried to make sure that they weren't; this was to make it possible to add, remove and refresh component sources without causing breakage.
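To show why the two matching modes matter, here is a small sketch of the search-plus-cache idea. It is Python rather than the real C implementation, and the names clib_search(), clib_flush_cache() and SYMBOL_NAMES are invented for the example; only the exact/glob distinction and the result cache come from the description above.

```python
import fnmatch

# Hypothetical library contents. Note the symbol whose name contains
# glob special characters.
SYMBOL_NAMES = [
    "resistor-1.sym",
    "resistor-2.sym",
    "opamp[dual].sym",
    "opampd.sym",
]

_search_cache = {}


def clib_search(pattern, glob=False):
    """Return symbol names matching pattern, exactly or as a glob."""
    key = (pattern, glob)
    if key not in _search_cache:
        if glob:
            hits = [n for n in SYMBOL_NAMES
                    if fnmatch.fnmatchcase(n, pattern)]
        else:
            hits = [n for n in SYMBOL_NAMES if n == pattern]
        _search_cache[key] = hits
    # Hand back a copy so callers cannot corrupt the cached list.
    return list(_search_cache[key])


def clib_flush_cache():
    """Forget cached results, e.g. after sources are added or refreshed."""
    _search_cache.clear()
```

With these names, an exact search for "opamp[dual].sym" finds that symbol, whereas the same string treated as a glob matches "opampd.sym" instead, which is the kind of strange breakage the exact mode avoids. The cache has to be flushed whenever the set of sources changes, otherwise stale results would be returned.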

I have introduced a patch to my personal tree which adds a list of currently used symbols to the component selector. I find this really useful already for speeding up adding frequently-used parts to my schematics, such as resistors and capacitors. I'm also hoping to add a button to the component selector dialog which allows you to directly open a selected symbol for editing.

My recent work (and discussions on the mailing lists) have inspired two more ideas, about which I'll go into more detail on a future occasion:

Per-directory configuration contexts.
Overhaul of the embedding system to make it more useful.

Saturday, June 02, 2007

gEDA is moving to using git

A few days ago, in response to other developers' comments/complaints on moving gEDA to using Guile 1.8, I suggested forking a stable branch of gEDA which would use Guile 1.6 while keeping the main development tree using Guile 1.8.

Ales decided this was a good idea, saying:

Here's the plan going forward:

I do want a stable/unstable branch/release arrangement going forward.

I do not want to use CVS to maintain this though.

So, I am going to go ahead and setup a git repository as the official repository of gEDA/gaf.

I think this is really good news, as git is so much better than CVS in many ways, most importantly (in my opinion) the speed and ease of branching, merging and maintaining patch sets. The web interface is better too.

One unresolved problem, though, is that Ales has not yet decided what to do about committing to the new unstable git branch (he's decided to maintain the stable branch himself, accepting patches by e-mail & SF tracker). There are a number of options.

The first option is the Linux kernel model: each person has their own repository, and the repository belonging to the head honcho (in gEDA's case Ales) is the "official" repository. In this case, the approach would be as follows:

A select group (probably fewer than five) of people (let's call them maintainers) have their own repos on git.gpleda.org (or elsewhere), from which Ales "pulls" changes. These would likely be people Ales trusts not to break anything or commit anything he wouldn't like to see in the master tree quite yet. Each maintainer would probably have a "for-ales" branch, to make sure that only changes that are ready go upstream.
Developers who are not maintainers and don't mind using git could use git-mail (and associated tools, such as StGit's mailing tool) to send patches by e-mail to a maintainer, who would integrate them into their personal repository. This is good for people who only make changes infrequently, or new developers "on probation". Git provides very good tools for importing changes from mbox files, so it's not too much work for the maintainers.
Those developers who don't like git could use git-cvsserver to browse the history and generate patches (which are again sent to maintainers).

The second option is a "shared central repository", as used by the X.org project. This is very similar in workflow to CVS, where there is a single central repository.

Some people would have the ability to "push" to the central repository. They could do this either using git or the git-cvsserver. People could have their own branches in the central repository, probably named "userid-topic".
Other developers would need to send their patches by e-mail or using the SF.net trackers.

Because the rate at which gEDA is developed is currently quite slow, I think the second option would be best for now, as it would make for a smoother transition from CVS to git. Since it is easy to switch from one approach to the other (and back again, if need be), it will be possible to experiment with other options if problems crop up.

Friday, June 01, 2007

Abject failure to install Fedora 7

I was very excited about the release of Fedora 7 yesterday, as I've been a loyal Fedora user since Fedora Core 3 was released back in 2004.

You'll notice my use of the past tense. This is because I spent most of today trying to get Fedora 7 to install. I'm extremely glad I took a full backup of my files, as my various attempts to get the installer to run without crashing hosed my existing Fedora Core 6 system and bootloader. It was extremely random; despite having verified checksums of all the installation media, run the CD checks recommended and all previous versions of Fedora Core having installed very smoothly, Anaconda would randomly die or hang at some point during the installation process. When it crashed just after formatting my root partition, I gave up.

I was saved by two things: my careful backing up of /home before the attempted upgrade, and the ability to use my other box to burn a Kubuntu CD. I now have a system which behaves almost exactly like my previous system (hey, it runs KDE and Emacs, what more do I need) and gEDA even compiles.

I'll see how I get on with Kubuntu: other gEDA developers use Ubuntu with success, so hopefully I won't find any need to reinstall Fedora Core 6.

Update: I'm now using Fedora 7, having not really "got" the whole Ubuntu/Debian thing. Installing from a USB stick got around the CD problems I was having quite successfully.

Tuesday, May 29, 2007

New library system now in CVS

After the SEUL server move was successfully completed, Ales released a new snapshot of gaf. With that out of the way I was given the all clear to merge my component library work.

I've therefore nuked the "libraries" branch from my public git repository, as it was no longer relevant.

There are a number of things I want to get sorted out before the next release (which is likely to be soon after the next code sprint on 9th June):

Rationalise the Scheme files in libgeda and geda-symbols. At the moment, the core rc file for gEDA is installed by the symbol package, and has loads of configure substitutions in it. My plan is to:
- Move all configure substitutions into a single file, config.scm, which would be installed with libgeda;
- Provide some utility functions for common tasks like building paths from components;
- Move system-gafrc from geda-symbols to libgeda, and split out the default component library & font setup code into geda-symbols.scm and geda-fonts.scm, which would be installed with geda-symbols.
Other improvements to the Scheme files could be carried out later, but this would lay the ground-work.
While on the subject of Scheme, gEDA currently uses a lot of Guile functions & macros which are marked as deprecated in the current stable branch of Guile, 1.8.x. Since Ales has given the all-clear to move gaf to Guile 1.8, there are a number of housekeeping tasks to do:
- Update gaf configure scripts to check for presence of Guile 1.8 or later. Although currently the scripts check for the presence of certain functions, this is really difficult for developers, because it's something of a moving target, and it's hard to keep the configure scripts up-to-date with all of the functions people have used. It seems a lot more sensible to deal with it the same way as the GTK+ dependency has been dealt with: specify a particular version of Guile to support, and then make sure people do not use functions which require a newer version. Since Guile is very stable, this doesn't seem unreasonable.
- Flush out all uses of deprecated interfaces, and generally make sure that we are using Guile the Right Way.
A longer-term aim would be to use "guile-snarf" to generate code for registering Guile procedures made available by libgeda & gschem's C code. This would reduce the number of places where code would have to be changed when adding, removing or modifying a Guile procedure, and would hopefully make life easier for people new to the way libgeda & gschem work with libguile.
Replace the C code which currently implements the directory and command component library backends with equivalent Scheme code. This would make the component library code much simpler, and would have the added advantage of providing examples of how to implement new component library backends. This would very much require both a full upgrade to Guile 1.8.x, as well as the aforementioned reorganization of Scheme files.
Optimisation of look-ups. A simple optimisation would be to cache the results of s_clib_glob(), probably in a hashtable.

I'm intending to leave the next big killer feature, component categorization, until the release after next.

There are two big things which I'd like to be fixed before the next release that I don't think I'll have time to do:

Fix the preview widget to work properly with the new component back-ends. Currently, it needs to have a pathname in order to load the symbol for display, but it should be able to use the o_read_buffer() function I recently added to libgeda.
Make gschem able to open schematics or symbols which are not associated with a file (probably read-only). This would be much harder, but is more-or-less required for using gschem as the graphical front-end of a design database system (a future idea).

These are a couple of jobs which follow on from each other quite intuitively, and would make a big difference to my work. If no-one steps forward to do them, I'll probably leave them for the release after next.

Anyway, none of the aforementioned will happen before my exams, which finish on the 8th June, so I guess I'll have an awful lot of coding to get done between the code sprint and the next release!