Friday, October 24, 2025

Implementing case-lambda for SKILL++

For a few years now, my day job at Cadence Design Systems has been working on Virtuoso Schematic Editor. Virtuoso is the most widely-used tool for designing analogue and mixed-signal integrated circuits. For reasons of commercial confidentiality, I haven't been able to write much about my work.

Virtuoso can be customised extensively by writing code in the SKILL programming language. SKILL is a Lisp, and it actually comes in two variants:

  • SKILL, which is a variant of Common Lisp
  • SKILL++, which is a variant of ANSI Scheme.

I do as much of my work as possible in SKILL++, and in recent years I have created a bunch of internal development tools to help me with that. This has included implementing some of the Scheme Requests For Implementation (SRFIs), because they describe well-thought-out and useful libraries that SKILL++ doesn't provide.

I recently came across a situation where I really needed the ability to define a procedure that could be called in two different ways. When this procedure was originally added decades ago, it could only be called with a lot of positional arguments, something like:

(someFunc 1 2 3 4 5 6 7 8)

Most callers of the procedure only ever need to pass meaningful values to a few of these parameters. This meant that a lot of the calls looked like this:

(someFunc nil nil nil nil nil importantValue nil nil)

This produced verbose, hard-to-read code, and it was impossible to write correct calls to this procedure without having the product manual to hand to check where to put the importantValue. Was it the 5th or 6th parameter?

So, we wanted to allow someFunc to be called with named arguments, to make the code more concise and self-explanatory. Something like:

(someFunc ?sixthParam importantValue)

However, this presented a challenge. How do you make a function that can be called both with positional arguments (because we don't want to break any existing code that uses it), and with keyword arguments (for new, improved usage)?

Designing caseLambda for SKILL

SRFI-16, Syntax for procedures of variable arity, provides the outline of a solution. It proposes the case-lambda syntax, which creates a procedure that dispatches between alternative clauses depending on how many arguments it is called with.

(define plus
  (case-lambda 
    (() 0)
    ((x) x)
    ((x y) (+ x y))
    ((x y z) (+ (+ x y) z))
    (args (apply + args))))

Two properties are immediately notable:

  1. SRFI-16 only defines syntax that works for clauses with fixed numbers of positional arguments, or which accept any number of arguments. For my use-case, I would need to extend the syntax.
  2. The rule is first match, i.e. when called with some set of arguments, the first clause which could be called with those arguments is selected. If there were two clauses taking 5 arguments, only the first would ever actually end up getting called.
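
For example, given two clauses of the same arity, the second can never be selected:

(define shadowed
  (case-lambda
    ((x) 'first)
    ((x) 'second)))

(shadowed 42)
--> first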

SKILL has built-in syntax for allowing procedures to be defined with either optional or keyword arguments:

;; Function that writes to a file if given, otherwise to standard output
(define (writeString text @optional filename)
  (let ((port (if filename (outfile filename) poport)))
    (write text port)))
    
;; Using keyword arguments to allow specifying port or filename
(define (writeString text @key port filename)
  (let ((port (cond
                (port port)
                (filename (outfile filename))
                (t poport))))
    (write text port)))

The final design challenge was related to the way that keyword arguments work in SKILL. In SKILL, any symbol that starts with a ? character is a keyword and is self-evaluating. It's otherwise just a regular value. This means that a function that accepts positional parameters can happily consume a keyword as a regular argument. Take the built-in list constructor, for example:

(list ?hello world)
--> (?hello world)

The current recommendation in the community of Scheme implementers is that you shouldn't have self-evaluating keywords, and that keywords should have special handling rather than being treated as ordinary data. Well, that genie fled the bottle in 1993; no putting it back in now!

So, the design I settled on was very strict first match: the first clause of the caseLambda that can be called with the provided argument list will be selected, no matter what. This requires some caution in writing functions that use it:

(define ohNo
  (caseLambda
    ((x y)      'positional)
    ((@key x y) 'keyword)))

(ohNo ?x 42)
--> positional

I came up with some guidelines for getting the best results when using SKILL caseLambda:

  1. Put your clauses in increasing order of number of required arguments, followed by increasing order of number of optional arguments.
  2. Put @key clauses before @optional clauses.
  3. An N-ary clause (one with an @rest parameter) will almost always prevent any subsequent clauses from being reachable, so you can really have at most one N-ary clause per caseLambda, and it should come last.
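
Putting these guidelines together, here's a sketch of how someFunc might be restructured (someFuncImpl is a hypothetical internal worker that both clauses delegate to, and the parameter names are made up for illustration):

(define someFunc
  (caseLambda
    ;; Keyword clause first: it requires no positional arguments
    ((@key firstParam secondParam thirdParam fourthParam
           fifthParam sixthParam seventhParam eighthParam)
     (someFuncImpl firstParam secondParam thirdParam fourthParam
                   fifthParam sixthParam seventhParam eighthParam))
    ;; Legacy clause: eight positional arguments, kept for old callers
    ((p1 p2 p3 p4 p5 p6 p7 p8)
     (someFuncImpl p1 p2 p3 p4 p5 p6 p7 p8))))

(someFunc nil nil nil nil nil importantValue nil nil)  ; old style
(someFunc ?sixthParam importantValue)                  ; new style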

Implementation challenges

SRFI-16's reference implementation is written entirely in syntax-rules. This is neat, because you can get case-lambda without any special support from your Scheme implementation!

However, this didn't seem like a good approach for adding caseLambda to SKILL.

When a SKILL function is called, the SKILL bytecode virtual machine already has to check the arguments passed against the function's parameter list to make sure the call is valid, and so the information about the positional and keyword parameters of each function is baked into every function object. I decided it would be better to make a generic matcher that could use this information than to write a complicated metaprogram to generate a custom matcher for each caseLambda.

I therefore decided to implement caseLambda as a very simple macro that compiles each clause into a function object, and then dispatches to a primitive to choose between the clauses at runtime. Something almost but not exactly like:

(define-syntax caseLambda
  (syntax-rules ()
    ((caseLambda "CLAUSE" (formals expr ...))
     (lambda formals expr ...))
    ((caseLambda clause ...)
     (lambda args
       (apply (%caseLambda-match% args (caseLambda "CLAUSE" clause) ...)
              args)))))

This approach is faster to compile, easier to implement and maintain, and generates much smaller bytecode. All complexity is contained in the matcher primitive, which is reused for every caseLambda function.
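
Conceptually, the matcher primitive behaves something like this sketch (a model only, not the real implementation, which lives inside the VM; funcCanApply is a hypothetical predicate standing in for the VM's built-in argument-list validation):

;; Return the first clause function that could be applied to args.
(define (matchClauseHelper args clauses)
  (cond
    ((null clauses)
     (error "caseLambda: no clause matches the given arguments"))
    ((funcCanApply (car clauses) args)
     (car clauses))
    (t (matchClauseHelper args (cdr clauses)))))

(define (%caseLambda-match% args @rest clauses)
  (matchClauseHelper args clauses))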

I suspect that it would have been even better to make caseLambda into compiler builtin syntax and add "case function" objects into the data model, so that the bytecode virtual machine could choose between clauses simultaneously with validating the argument list for the call. As far as I can tell, most Scheme implementations that have both keyword-based function calls and case-lambda implement it as a builtin.

Conclusions

I'd been hankering after this language feature for a few years, but I'd been putting it off because it seemed unreasonably difficult. I was fiddling with something related when I suddenly had the inspiration for how to do this and, in the end, it only took me a couple of days to get it to the point that I feel happy using it in production.

Perhaps you are a customer CAD team engineer who happens to be reading this and thinking, "This sounds useful! Can I use it?" Alas, the answer is no -- pretty much all of the work I've been doing on SKILL++ libraries and development tools is for use only by Cadence R&D. Sorry!

Wednesday, February 22, 2017

Deploying buildbot workers on Windows Server 2016

At LiveCode, we use a buildbot system to perform our continuous integration and release builds. Recently, we moved from building our Windows binaries in a Linux container using Wine to building on a native Windows system running in an Azure virtual machine.

Deploying buildbot on Windows is not totally straightforward, and the documentation for installing it is quite hard to follow. It's quite important to us that our build infrastructure is reproducible, so we wanted to have a procedure that could bring up a buildbot worker on a newly-allocated server quickly and with as little manual intervention as possible.

This blog post provides step-by-step instructions for installing buildbot 0.8.12 on Windows Server 2016 Datacenter Edition, with explanations of what's going on at each step. The target configuration is a buildbot worker that runs as an unprivileged user and communicates with the buildbot master over an SSL tunnel. All of the commands are written using PowerShell. It's recommended to run them via the 'PowerShell ISE' application, running as a user in the 'Administrators' group. The full script is available as a GitHub Gist.

Although this describes installing buildbot 0.8.12, there's no reason it shouldn't work for buildbot 0.9.x. If you try it, please let me know how you get on in the comments.

Note: Don't run these commands unless you've checked them very carefully first. They're adapted from the scripts used for our buildbot deployment, and may not work as you expect. You should use them as the basis of your own installation script and test it thoroughly before using it in production.

Support functions

First, ensure that the script stops immediately if any error is thrown, and that "verbose" messages are displayed.

$VerbosePreference = 'Continue'
$ErrorActionPreference = 'Stop'

By default, PowerShell doesn't convert non-zero exit codes from subprocesses into errors, so define a helper function that can be used to accomplish this. CheckLastExitCode will throw an error on any non-zero exit code, but if there are other exit codes that should be considered successful, you can pass in an array of permitted exit codes, e.g. CheckLastExitCode(@(0,10)).

function CheckLastExitCode {
    param ([int[]]$SuccessCodes = @(0))
    if ($SuccessCodes -notcontains $LASTEXITCODE) {
        throw "Command failed (exit code $LASTEXITCODE)"
    }
}

Later steps need to download a few resource files. For this to work, you'll need to implement a Fetch-BuildbotResource function that obtains a named resource file and places it at a given output location. Fill in the blanks (possibly with some sort of Invoke-WebRequest):

function Fetch-BuildbotResource {
    param([string]$Path,
          [string]$OutFile)
    # Your code goes here
}
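
As a rough sketch, a minimal implementation might just download each resource from an internal HTTPS server (the base URL below is a placeholder, not a real service):

function Fetch-BuildbotResource {
    param([string]$Path,
          [string]$OutFile)
    # Placeholder URL -- point this at your own resource store
    $t_base_url = 'https://deploy.example.org/resources'
    Invoke-WebRequest -Uri "$t_base_url/$Path" `
        -OutFile $OutFile -UseBasicParsing
}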

It's also a good idea to activate Windows. The virtual machines provisioned by Azure may not have been activated; this command will do so automatically.

cscript.exe C:\Windows\System32\slmgr.vbs /ato

Finally, define variables with the root path for the buildbot installation and the IP or DNS address of the buildbot master, and create the buildbot worker's root directory:

$k_buildbot_root = 'C:\buildbot'
$k_buildbot_master = 'buildbot.example.org'

New-Item -Path $k_buildbot_root -ItemType Container -Force | Out-Null

Installing programs with Chocolatey

Chocolatey is a package manager for Windows that can automatically install a variety of applications and services in much the same way as the Linux `apt-get`, `dnf` or `yum` programs. Here, you can use it for installing Python (for running buildbot) and for installing the stunnel SSL tunnel service.

Install Chocolatey by the time-honoured process of "downloading a random script from the Internet and running it as a superuser".

$env:ChocolateyInstall = 'C:\ProgramData\chocolatey'

# Install Chocolatey, if not already present
if (!(Test-Path -LiteralPath $env:ChocolateyInstall -PathType Container)) {
    Invoke-WebRequest 'https://chocolatey.org/install.ps1' -UseBasicParsing | Invoke-Expression
}

Next, use Chocolatey to install stunnel and Python 2.7:

Write-Verbose 'Installing Python and stunnel'
choco install --yes stunnel python2
CheckLastExitCode

Installing Python modules and buildbot

It's easiest to install buildbot and its dependencies using the pip Python package manager.

Write-Verbose 'Installing Python modules'
$t_pip = 'C:\Python27\Scripts\pip.exe'
& $t_pip install pypiwin32 buildbot-slave==0.8.12
CheckLastExitCode

The pypiwin32 package installs some DLLs that are required for buildbot to run as a service, but when installed with pip, these DLLs are not automatically registered in the Windows registry. This cost me at least a day of wondering why my buildbot service was failing to start with a thoroughly uninformative error message.

Luckily, pypiwin32 installs a script that will set everything up properly.

Write-Verbose 'Registering pywin32 DLLs'
$t_python = 'C:\Python27\python.exe'
& $t_python C:\Python27\Scripts\pywin32_postinstall.py -install
CheckLastExitCode

SSL tunnel service

You'll need to configure stunnel to run on your buildbot master, and listen on port 9988. I recommend configuring the buildbot master's stunnel with a certificate, and then making sure workers always fully authenticate the certificate when connecting to it. This will prevent people from obtaining your workers' login credentials by impersonating the buildbot master machine.
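
For reference, the master-side stunnel configuration might look something like this sketch (the certificate and key paths are placeholders, and 9989 is assumed to be the master's plain worker port):

[buildbot]
accept = 9988
connect = 127.0.0.1:9989
cert = /etc/stunnel/buildbot.crt
key = /etc/stunnel/buildbot.key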

Write-Verbose 'Installing buildbot-stunnel service'
$t_stunnel = 'C:\Program Files (x86)\stunnel\bin\stunnel.exe'
$t_stunnel_conf = Join-Path $k_buildbot_root 'stunnel.conf'
$t_stunnel_crt  = Join-Path $k_buildbot_root 'buildbot.crt'

# Fetch the client certificate that will be used to authenticate
# the buildbot master
Fetch-BuildbotResource `
    -Path 'buildbot/stunnel/master.crt' -Outfile $t_stunnel_crt

# Create the stunnel configuration file
Set-Content -Path $t_stunnel_conf -Value @"
[buildbot]
client = yes
accept = 127.0.0.1:9989
cafile = $t_stunnel_crt
verify = 3 
connect = $k_buildbot_master:9988
"@

# Register the stunnel service, if not already present
if (!(Get-Service buildbot-stunnel -ErrorAction Ignore)) {
    New-Service -Name buildbot-stunnel `
        -BinaryPathName "$t_stunnel -service $t_stunnel_conf" `
        -DisplayName 'Buildbot Secure Tunnel' `
        -StartupType Automatic
}

The buildbot worker instance

Creating and configuring the worker instance, and setting up buildbot to run as a Windows service, are the most complicated part of the installation process. Before dealing with the Windows service, instantiate a worker with the info it needs to connect to the buildbot master.

First, set up a bunch of values that will be needed later. The worker's name will just be the name of the server it's running on, and it will be configured to use a randomly-generated password.

Write-Verbose 'Initialising buildbot worker'

# Needed for password generation
Add-Type -AssemblyName System.Web

$t_buildbot_worker_script = 'C:\Python27\Scripts\buildslave'

$t_worker_dir = Join-Path $k_buildbot_root worker
$t_worker_name = "$env:COMPUTERNAME"
$t_worker_password = `
    [System.Web.Security.Membership]::GeneratePassword(12,0)
$t_worker_admin = 'Example Organisation'

Run buildbot to actually instantiate the worker. We have to manually check the contents of the standard output from the setup process, because the exit status isn't a reliable indicator of success.

$t_log = Join-Path $k_buildbot_root setup.log
Start-Process -Wait -NoNewWindow -FilePath $t_python `
    -ArgumentList @($t_buildbot_worker_script, 'create-slave', `
        $t_worker_dir, '127.0.0.1', $t_worker_name, `
        $t_worker_password) `
    -RedirectStandardOutput $t_log

# Check log file contents
$t_expected = "buildslave configured in $t_worker_dir"
if ((Get-Content $t_log)[-1] -ne $t_expected) {
    Get-Content $t_log | Write-Error
    throw "Build worker setup failed (exit code $LASTEXITCODE)"
}

It's helpful to provide some information about the host and who administers it.

Set-Content -Path (Join-Path $t_worker_dir 'info\admin') `
    -Value $t_worker_admin
Set-Content -Path (Join-Path $t_worker_dir 'info\host') `
    -Value (Get-WmiObject -Class Win32_OperatingSystem).Caption

While testing our Windows-based buildbot workers, I found that I was getting "slave lost" errors during many build steps. Getting the workers to send very frequent "keep alive" messages to the build master almost entirely prevented this from happening. I used a 10 second period, but you might find that unnecessarily frequent.

$t_config = Join-Path $t_worker_dir buildbot.tac
Get-Content $t_config | `
    ForEach {$_ -replace '^keepalive\s*=\s*.*$', 'keepalive = 10'} | `
    Set-Content "$t_config.new"
Remove-Item $t_config
Move-Item "$t_config.new" $t_config

Configuring the buildbot service

Now for the final part: getting buildbot to run as a Windows service. It's a bad idea to run the worker as a privileged user, so this will create a 'BuildBot' user with a randomly-generated password, configure the service to use that account, and make sure it has full access to the worker's working directory.

Some of the commands used in this section expect passwords to be handled in the form of "secure strings" and some expect them to be handled in the clear. There's a fair degree of shuttling between the two representations.
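
For reference, the shuttling looks like this (the password literal is just a stand-in):

# Clear text -> secure string
$t_secure = ConvertTo-SecureString 'example-password' -AsPlainText -Force

# Secure string -> clear text, via a temporary BSTR
$t_bstr  = [Runtime.InteropServices.Marshal]::SecureStringToBSTR($t_secure)
$t_clear = [Runtime.InteropServices.Marshal]::PtrToStringAuto($t_bstr)
[Runtime.InteropServices.Marshal]::ZeroFreeBSTR($t_bstr)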

Once again, begin by setting up some variables to use during these steps.

Write-Verbose 'Installing buildbot service'

$t_buildbot_service_script = 'C:\Python27\Scripts\buildbot_service.py'
$t_service_name = 'BuildBot'
$t_user_name = $t_service_name
$t_full_user_name = "$env:COMPUTERNAME\$t_service_name"

$t_user_password_clear = `
    [System.Web.Security.Membership]::GeneratePassword(12,0)
$t_user_password = `
    ConvertTo-SecureString $t_user_password_clear -AsPlainText -Force

Create the 'BuildBot' user:

$t_user = New-LocalUser -AccountNeverExpires `
    -PasswordNeverExpires `
    -UserMayNotChangePassword `
    -Name $t_user_name `
    -Password $t_user_password

You need to create the buildbot service by running the installation script provided by buildbot. Although there's a New-Service command in PowerShell, the pywin32 support for services written in Python expects a variety of registry keys to be set up correctly, and it won't work properly if they're not.

& $t_python $t_buildbot_service_script `
    --username $t_full_user_name `
    --password $t_user_password_clear `
    --startup auto install
CheckLastExitCode

It's still necessary to tell the service where to find the worker directory. You can do this by creating a special registry key that the service checks on startup to discover its workers.

$t_parameters_key = "HKLM:\SYSTEM\CurrentControlSet\Services\$t_service_name\Parameters"
New-Item -Path $t_parameters_key -Force
Set-ItemProperty -Path $t_parameters_key -Name "directories" `
    -Value $t_worker_dir

Although the service is configured to start as the 'BuildBot' user, that user doesn't yet have the permissions required to read and write in the worker directory.

$t_acl = Get-Acl $t_worker_dir
$t_access_rule = New-Object `
    System.Security.AccessControl.FileSystemAccessRule `
    -ArgumentList @($t_full_user_name, 'FullControl', `
        'ContainerInherit,ObjectInherit', 'None', 'Allow')
$t_acl.SetAccessRule($t_access_rule)
Set-Acl $t_worker_dir $t_acl

Granting 'Log on as a service' rights

Your work is nearly done! However, there's one task that I have not yet worked out how to automate, and still requires manual intervention: granting the 'BuildBot' user the right to log on as a service. Without this right, the buildbot service will fail to start with a permissions error.

  1. Open the 'Local Security Policy' tool
  2. Choose 'Local Policies' -> 'User Rights Assignment' in the tree
  3. Double-click on 'Log on as a service' in the details pane
  4. Click 'Add User or Group', and add 'BuildBot' to the list of accounts

Time to launch

Everything should now be correctly configured!

There's one final bit of work required: you need to add the worker's username and password to the buildbot master's list of authorised workers. If you need it, you can obtain the username and password for the worker using PowerShell:

Get-Content C:\buildbot\worker\buildbot.tac | `
    Where {$_ -match '^(slavename|passwd)' }

You can use the `Start-Service` command to start the stunnel and buildbot services:

Start-Service buildbot-stunnel
Start-Service buildbot

Conclusions

You can view the full script described in this blog post as a GitHub Gist.

On top of installing buildbot itself, you'll need to install the various toolchains that you require. If you're using Microsoft Visual Studio, the "build tools only" installers provided by Microsoft for MSVC 2010 and MSVC 2015 are really useful. Many other dependencies can be installed using Chocolatey.

Installing buildbot on Windows is currently a pain, and I hope that someone who knows more about Windows development than I do can help the buildbot team make it easier to get started.

Tuesday, February 21, 2017

How to stop mspdbsrv from breaking your continuous integration system

Over the last month, I've been working on getting the LiveCode build cluster to do Windows builds using Visual Studio 2015. We've been using Visual Studio 2010 since I originally set up the build service in mid-2015. This upgrade was prompted by needing support for some C++ language features used by the latest version of libskia.

Once the new Windows Server buildbot workers had their tools installed and were connected to the build service, I noticed a couple of pretty weird things going on:

  • after one initial build, the build workers were repeatedly failing to clean the build tree in preparation for the next build
  • builds were getting "stuck" after completing successfully, and were then being detected as timed out and forcibly killed

Blocked build tree cleanup

The first problem was easy to track down. I guessed that the clean step was failing because some process still had an open file handle to one of the files or directories that the worker was trying to delete. I used the Windows 'Resource Monitor' application (resmon.exe), which can be launched from the 'Start' menu or from 'Task Manager', to find the offending process. The 'CPU' tab lets you search all open file handles on the system by filename, and I quickly discovered that mspdbsrv.exe was holding a file handle to one of the build directories.

What is mspdbsrv?

mspdbsrv is a helper service used by the Visual Studio C and C++ compiler, cl.exe; it collects debugging information for code that's being compiled and writes out .pdb databases. CL automatically spawns mspdbsrv if debug info is being generated and it can't connect to an existing instance. When the build completes, CL doesn't clean up any mspdbsrv instance that it spawned; it just leaves it running. There's no way to prevent CL from doing this.

So, it looked like the abandoned mspdbsrv instance had its current working directory set to one of the directories that the build worker was trying to delete, and on Windows you can't delete a directory while a process is using it as its current working directory. So much for the first problem.

Build step timeouts

The second issue was more subtle -- but it also appeared to be due to the lingering mspdbsrv process! I noticed that mspdbsrv was actually holding a file handle to one of the buildbot worker's internal log files. It appears that buildbot doesn't close its file handles when starting build processes, so these handles were being inherited by mspdbsrv, which was holding them open. As a result, the buildbot worker (correctly) inferred that there were still unfinished build job processes running, and didn't report the build as completed.

Mismatched MSVC versions

When I thought through this a bit further, I realised there was another problem being caused by lingering mspdbsrv instances. Some of the builds being handled by the Windows build workers need to use MSVC 2015, and some still need to use MSVC 2010. Each type of build should use the corresponding version of mspdbsrv, but by default CL always connects to any available service process.

Steps towards a fix

So, what was the solution?

  1. Run mspdbsrv explicitly as part of the build setup, and keep a handle to the process so that it can be terminated once the build has finished.
  2. Launch mspdbsrv with a current working directory outside the build tree.
  3. Force CL to use a specific mspdbsrv instance rather than just picking any available one.

LiveCode CI builds are now performed using a Python helper script. Here's a snippet that implements all of these requirements (note that it hardcodes the path to the MSVC 2010 mspdbsrv.exe):

import os
import subprocess
import uuid

# Find the 32-bit program files directory
def get_program_files_x86():
    return os.environ.get('ProgramFiles(x86)',
                          os.environ.get('ProgramFiles',
                                         'C:\\Program Files\\'))

# mspdbsrv is the service used by Visual Studio to collect debug
# data during compilation.  One instance is shared by all C++
# compiler instances and threads.  It poses a unique challenge in
# several ways:
#
# - If not running when the build job starts, the build job will
#   automatically spawn it as soon as it needs to emit debug symbols.
#   There's no way to prevent this from happening.
#
# - The build job _doesn't_ automatically clean it up when it finishes
#
# - By default, mspdbsrv inherits its parent process' file handles,
#   including (unfortunately) some log handles owned by Buildbot.  This
#   can prevent Buildbot from detecting that the compile job is finished
#
# - If a compile job starts and detects an instance of mspdbsrv already
#   running, by default it will reuse it.  So, if you have a compile
#   job A running, and start a second job B, job B will use job A's
#   instance of mspdbsrv.  If you kill mspdbsrv when job A finishes,
#   job B will die horribly.  To make matters worse, the version of
#   mspdbsrv should match the version of Visual Studio being used.
#
# This class works around these problems:
#
# - It sets the _MSPDBSRV_ENDPOINT_ to a value that's probably unique to
#   the build, to prevent other builds on the same machine from sharing
#   the same mspdbsrv endpoint
#
# - It launches mspdbsrv with _all_ file handles closed, so that it
#   can't block the build from being detected as finished.
#
# - It explicitly kills mspdbsrv after the build job has finished.
#
# - It wraps all of this into a context manager, so mspdbsrv gets killed
#   even if a Python exception causes a non-local exit.
class UniqueMspdbsrv(object):
    def __enter__(self):
        os.environ['_MSPDBSRV_ENDPOINT_'] = str(uuid.uuid4())

        mspdbsrv_exe = os.path.join(get_program_files_x86(),
            'Microsoft Visual Studio 10.0\\Common7\\IDE\\mspdbsrv.exe')
        args = [mspdbsrv_exe, '-start', '-shutdowntime', '-1']
        print(' '.join(args))
        self.proc = subprocess.Popen(args, cwd='\\', close_fds=True)
        return self

    def __exit__(self, type, value, traceback):
        self.proc.terminate()
        return False

You can then use this when implementing a build step:

with UniqueMspdbsrv() as mspdbsrv:
    # Do your build steps here (e.g. msbuild invocation)
    pass

# mspdbsrv automatically cleaned up by context manager

It took me a couple of days to figure out what was going on and to find an adequate solution. A lot of very tedious trawling through obscure bits of the Internet was required to find all of the pieces; for example, Microsoft do not document the arguments to mspdbsrv or the environment variables that it understands anywhere on MSDN.

Hopefully, if you are running into problems with your Jenkins or buildbot workers interacting weirdly with Microsoft Visual Studio C or C++ builds, this will save you some time!

Monday, December 05, 2016

When C uninitialised variables and misleading whitespace combine

Recently, LiveCode Builder has gained a namespace resolution operator ("."). It allows LCB modules to declare functions, constants, and variables which have the same name, by providing a way for modules that import them to distinguish between them.

During this work, we ran into a problem: the modified LCB compiler (lc-compile) worked correctly in "Debug" builds, but reliably crashed in "Release" builds. More peculiarly, we found that lc-compile crashes depended on which compiler was used: some builds using certain versions of GCC crashed reliably, while some builds using clang worked fine. We spent a lot of time staring at output from gdb and Valgrind, and came to the conclusion that maybe it was a compiler bug in GCC.

It turned out that we were wrong. When we switched to using clang to build full LiveCode releases, the mysterious crashes popped up again. Since this had now become a problem that was breaking the build, I decided to dig into it again. Originally, we'd not been able to duplicate the crash in very recent versions of GCC and clang, so my first step was to try and make lc-compile crash when compiled with GCC 6.

The problem seemed to revolve around some code in the following form:

class T;
typedef T* TPtr;

// (1) function returning true iff r_value was set
bool maybeFetch(TPtr& r_value);

void f()
{
    TPtr t_value;
    if (maybeFetch(t_value))
    {
        // (2) dereference t_value
    }
}

lc-compile was sometimes, but not reliably, crashing at point (2).

Initially, when I compiled with GCC 6, I was not able to induce a crash. However, I did receive a warning that t_value might be used without being initialised. I therefore modified the implementation of f() to initialise t_value at its declaration:

void f()
{
    TPtr t_value = nullptr;
    // ...
}

With that modification, the crash became reliably reproducible in all build modes using all of the compilers I had available. This drew my suspicion to the maybeFetch() function (1). The function's API contract requires it to return true if (and only if) it sets its out parameter r_value, and return false otherwise.

So, I had a look at it, and it looked fine. What else could be going wrong?

Much of lc-compile is implemented using a domain-specific language called Gentle, which generates bison and flex grammars, which are in turn used to generate some megabytes of C code that's hard to read and harder to debug.

I disappeared into this code for quite a while, and couldn't find anything to suggest that the Gentle grammar was wrong, or that the generated code was the cause of the segfault. What I did find suggested that there were problems with the values being provided by the maybeFetch() function.

Because explicit initialisation made the crashes reliable and reproducible, I came to the conclusion that maybeFetch() was sometimes returning true without setting its out parameter. So, what was maybeFetch() doing?

A simplified form of maybeFetch() as I found it was:

bool maybeFetch(TPtr& r_value)
{
    for (TPtr t_loop_var = /* loop form ... */)
    {
        if (condition(t_loop_var))
            r_value = t_loop_var;
            return true;
    }
    return false;
}

Needless to say, when I saw the problem it was a moment of slightly bemused hilarity. This function had been reviewed several times by various team members, and all of us had missed the missing block braces { ... } hidden by misleading indentation.

if (condition(t_loop_var))
{ // missing open brace
    r_value = t_loop_var;
    return true;
} // missing close brace

Once these braces had been inserted, all of the problems went away.

What lessons could be taken away from this?

  1. The bug itself eluded review because of misleading indentation. GCC 6 provides a "misleading indentation" warning which would have immediately flagged up this bug if it had been enabled. We do not use GCC 6 for LiveCode builds; even if we did, we wouldn't be able to enable the "misleading indentation" warning to good effect, because the LiveCode C++ sources don't currently use a consistent indentation style. This problem could perhaps be avoided if LiveCode builds enforced a specific indentation style (in which case the bug would have been obvious in review), or if we regularly did builds with GCC 6 and -Werror=misleading-indentation.
  2. The effect of the bug was an API contract violation, where the relationship between the return value and the value of an out parameter wasn't satisfied. The problem could have been avoided if the API contract was expressed in a way that the compiler could check. C++17 adds std::optional, which combines the idea of "is there a value or not" with returning the value itself. If the function took the form std::optional<TPtr> maybeFetch(), it would have been impossible for it to claim to return a value without actually returning one (see the sketch after this list).
  3. Finally, the problem was obfuscated by failing to initialise stack variables. Although the pointer on the stack would have been initialised before use if maybeFetch() had been working correctly, in this case it wasn't. Diagnosing the problem might have been much easier if we routinely initialised stack variables to suitably uninformative values at the point of declaration, even if we think they _should_ get initialised via a function's out parameter before use.
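
To make point 2 concrete, here's a minimal sketch of the std::optional approach (the container being searched and the condition() predicate are stand-ins, since the real loop form isn't shown above):

#include <optional>
#include <vector>

class T { /* ... */ };
typedef T* TPtr;

// Stand-in predicate; the real condition isn't shown above
static bool condition(TPtr p) { return p != nullptr; }

// The value and the "did we find one?" flag now travel together, so
// the function cannot claim success without supplying a value.
static std::optional<TPtr> maybeFetch(const std::vector<TPtr>& candidates)
{
    for (TPtr t_loop_var : candidates)
    {
        if (condition(t_loop_var))
            return t_loop_var;   // returning a value implies success
    }
    return std::nullopt;         // unambiguously "no value"
}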

This was a very easy mistake to make, and an easy issue to miss in a code review, but it was very costly to clean up. I hope that we'll be able to make some changes to our development processes and our coding style to try and avoid things like this happening in the future.

Update: My colleague points out another contributing factor to making the error hard to spot: the condition(t_loop_var) was a composite condition spread across multiple lines, i.e.

if (conditionA(t_loop_var) ||
    (conditionB(t_loop_var) &&
     conditionC(t_loop_var)))

    r_value = t_loop_var;
    return true;

This code layout makes it even less obvious where the body of the if lies.

Thursday, October 27, 2016

Playing with Bus1

David Herrmann and Tom Gundersen have been working on new, performant Linux interprocess communication (IPC) proposals for a few years now. First came their proposed kdbus system, which would have provided a DBus-compatible IPC system, but this didn't actually get merged because of several design issues that couldn't be worked around (mostly security-related).

So, they went back to the drawing board, and now have come back with a new IPC system called Bus1, which was described in a LWN article back in August. Yesterday, they posted draft patches to the Linux kernel mailing list, and the kernel module and userspace libraries are available on GitHub for your convenience.

I decided to find out what's involved in getting the experimental Bus1 code up and running on my system. I run Fedora Linux, but broadly similar steps can be used on other Linux distributions.

Installing tools

The first thing to do is to install some development tools and headers.

sudo dnf install git kernel-devel
sudo dnf builddep kernel

I'm going to need git for getting the source code, and the kernel-devel development headers for compiling the Bus1 kernel module. The special dnf builddep command automatically fetches all of the packages needed for compiling a particular package — in this case, we're compiling a kernel module, so just grabbing the tools needed for compiling the kernel should include everything necessary.

Building the kernel module

I need to get the Bus1 kernel module's source code using git:

mkdir ~/git
cd ~/git
git clone https://github.com/bus1/bus1.git
cd bus1

With all of the tools I need already installed, I can very simply run

make

to compile the Bus1 module.

Finally, the Bus1 Makefile provides an all-in-one solution for running the module's tests and loading it into the running kernel:

make tt

After several seconds of testing and benchmarking, I get some messages like:

[ 1555.889884] bus1: module verification failed: signature and/or required key missing - tainting kernel
[ 1555.891534] bus1: run selftests..
[ 1555.893530] bus1: loaded

Success! Now my Linux system has Bus1 loaded into its kernel! But what can be done with it? I need some userspace code that understands how to use Bus1 IPC.

Building the userspace library

The Bus1 authors have provided a basic userspace library for use when writing programs that use Bus1. How about building it and running its tests to check that Bus1 is actually usable?

Some additional tools are needed for compiling libbus1, because it uses GNU Autotools rather than the kernel build system:

sudo dnf install autoconf automake

As before, I need to checkout the source code:

cd ~/git
git clone https://github.com/bus1/libbus1.git
cd libbus1

I can then set up its build system and configure the build by running:

./autogen.sh
./configure

But there's a problem! I need to install a couple of obscure dependencies: David Herrmann's c-sundry and c-rbtree libraries.

This is accomplished by something along the lines of:

cd ~/git
git clone https://github.com/c-util/c-sundry.git
git clone https://github.com/c-util/c-rbtree
# Install c-sundry
cd ~/git/c-sundry
./autogen.sh
./configure
make
sudo make install
# Install c-rbtree
cd ~/git/c-rbtree
./autogen.sh
./configure
make
sudo make install

So, with the dependency libraries installed, it's now possible to build libbus1. Note that the configure script won't pick up the newly-installed dependencies, because on Fedora pkg-config doesn't scan the /usr/local/lib/pkgconfig directory by default, so I have to give it a bit of help.

cd ~/git/libbus1
./autogen.sh
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./configure
make

Amusingly, this failed the first time due to a bug for which I submitted a patch. However, with the patch applied to c-sundry, I've got a successful build of libbus1!

I also ended up having to add /usr/local/lib to /etc/ld.so.conf (and re-run ldconfig) so that the c-rbtree library got detected properly when running the libbus1 test suite.

Even after that, unfortunately the test suite failed. Clearly the Bus1 userspace libraries aren't as well-developed as the kernel module! Maybe someone could do something about that...?

Wednesday, November 04, 2015

Japanese Shioyaki-style mackerel

This is a guest post written by Kathryn Grant, who has a knack for picking out exotic yet easy-to-cook recipes!

This is a quick version of 鯖の塩焼き (saba no shioyaki), or salt-grilled mackerel, served with cucumber pickle and toasted sesame seeds. This recipe serves 2 people.

Ingredients

For the cucumber pickle:

  • ½ cucumber, halved lengthways and sliced
  • 1 tsp cooking salt
  • 50 ml rice wine (or white wine) vinegar
  • 3 tbsp dark soy sauce
  • 1 tbsp toasted sesame seed oil
  • 1 tsp sugar
  • ¼–½ tsp chilli powder
  • 3 spring onions

For the grilled mackerel:

  • 3 tbsp soy sauce
  • 1 tbsp rice wine (or white wine) vinegar
  • 1 tsp toasted sesame oil
  • 2 fresh mackerel fillets
  • Sea salt
  • Vegetable oil

For the rice:

  • 120-180 g rice (depending on hunger levels)
  • 1 tbsp toasted sesame oil
  • Sea salt

To serve:

  • 2 tbsp black sesame seeds
  • Lemon wedges
  • Finely sliced daikon radish or other radish (optional)

Method

  1. Chop the cucumber and place in a bowl. Sprinkle 1 tsp cooking salt over the cucumber and leave for 5 minutes. Meanwhile, mix the marinade ingredients together: vinegar, soy sauce, toasted sesame oil, sugar and chilli powder. Chop the spring onion. Once the 5 minutes is up, rinse the cucumber thoroughly with cold water to remove the salt, drain and place back into the bowl. Pour over the marinade, add in the spring onions, cover with clingfilm and set aside somewhere cool.
  2. Mix the marinade for the mackerel: soy sauce, vinegar and toasted sesame oil. Pour into a shallow dish. Wash the fish and place, skin-side up, in the shallow dish. Leave to marinate for 10 minutes.
  3. Pre-heat the grill to a high heat.
  4. Toast the black sesame seeds in the bottom of a dry pan for around 2 minutes, taking care not to burn them. Remove from the heat and set aside.
  5. Shred the radish, if using.
  6. Boil a kettle of water. Heat 1 tbsp of toasted sesame oil in a saucepan. Wash the rice thoroughly, until the water runs clear, then add to the saucepan. Fry for 1 minute, stirring continuously to make sure the rice does not burn. Cover the rice with water, season with a pinch of salt and simmer for approximately 12 minutes (check packet instructions).
  7. Whilst the rice is cooking, remove the mackerel from the marinade and pat dry with paper towels to remove excess moisture. Sprinkle the non-skin side with sea salt and let the fish rest for 5 minutes.
  8. Prepare a tray for grilling: line a baking tray with foil and grease with about 1 tbsp of vegetable oil.
  9. After the fish has rested, place it onto the baking tray (skin-side down) and grill for 5 minutes until the fish is cooked. The skin should be crispy and the surface lightly browned.
  10. Serve the cucumber pickle and rice sprinkled with the toasted sesame seeds. The excess cucumber marinade makes an excellent sauce for the rice. Serve the fish with lemon (or lime) wedges and shredded radish. The lemon/lime wedges really bring out the flavour of the fish.

Monday, October 19, 2015

Beetroot risotto

One of my most popular dishes is beetroot risotto. It's both the recipe that I get asked for most often, and the recipe that people go out of their way to tell me that they enjoyed making. Here's the (quite simple!) recipe so that you can enjoy it too!

This recipe serves two, and is especially good with some slices of pan-roasted duck breast on top. Yum.

Ingredients

  • 1 beetroot
  • 1 large carrot
  • 1 small onion
  • Half a celery stick
  • 1 garlic clove
  • 1 tbsp olive oil
  • 100 g risotto rice
  • 30 g Parmesan cheese
  • Butter
  • Fresh parsley
  • Salt & pepper

Method

First, peel the beetroot and carrot, and cut them into 1 cm cubes. Put them in a saucepan with a pinch of salt and enough water to cover them, bring them to the boil, and let them simmer for about 20 minutes.

While they're cooking, finely chop the onion, garlic and celery. Heat the olive oil in a large frying pan, and saute the chopped vegetables gently in the olive oil until they're soft and translucent. Also, grate the Parmesan, chop the parsley, and boil a full kettle.

Once the beetroot and carrot are cooked, strain off the liquid into a jug and set the vegetables to one side.

Turn up the heat in the frying pan, and gently fry the rice with the onion, garlic and celery for 1–2 minutes. Then add a little of the stock from cooking the beetroot and carrot (that you saved earlier in a jug), and stir the rice until almost all the liquid has been absorbed. Repeat until you run out of liquid. Add the root vegetables into the pan, and continue to gradually add hot water (from the kettle) while gently stirring until the rice is cooked.

Take the risotto off the heat, and stir in the Parmesan, the parsley, and a knob of butter. Let it rest for a minute, and serve in bowls with some freshly-ground black pepper on top!

Monday, October 12, 2015

Pan-roast venison haunch with pumpkin risotto

The rather awesome K. and I have been going out for three years! We made a special dinner to celebrate.

This recipe, unsurprisingly, serves two. Best accompanied by a nice Pinot Noir!

Ingredients

For the venison:

  • 12 oz (350 g) venison haunch, in one piece
  • 1 tbsp sunflower oil
  • 30 g butter
  • 25 ml gin
  • 1 tsp plain flour
  • 150 ml red wine
  • 300 ml lamb stock
  • 1 bay leaf
  • 1 small sprig rosemary
  • 5 juniper berries
  • Salt & pepper

For the risotto:

  • 1 tbsp olive oil
  • 1 onion
  • 2 cloves garlic
  • 1 celery stick
  • 300 g pumpkin
  • Some kale (a generous handful)
  • 100 g risotto rice
  • 150 ml white wine
  • 500 ml vegetable stock
  • 30 g Parmesan cheese
  • Butter
  • Salt & pepper

To serve:

  • Parsley leaves
  • Parmesan shavings

You will need a digital kitchen thermometer.

Method

I'm listing the two methods separately, but you'll need to do them simultaneously. Make sure you have all the equipment and ingredients ready before you start!

For the venison:

  1. At least an hour in advance, remove the venison from the fridge, remove all packaging, and pat dry with a clean paper towel. Place it on a clean chopping board and leave to dry in the air.
  2. Put a roasting tin in the oven and preheat to 120 °C fan. Heat the sunflower oil in a heavy-based frying pan over a high heat.
  3. Season the venison with salt and pepper. Fry the venison for 1–2 minutes on each side until sealed and browned. Add the butter to the pan and baste continuously for 3 minutes, turning occasionally, then transfer it to the preheated roasting tin in the oven.
  4. While the venison is in the oven, make sure to check it periodically with the thermometer — the aim is to reach 63 °C in the centre of the meat [1], but don't let it get any hotter than that, or it'll dry out! It'll need about 15–20 minutes in the oven.
  5. Deglaze the frying pan with the gin, then add the flour and mix to a paste. Add the red wine and herbs, and simmer over a high heat until reduced by half.
  6. Remove the rosemary (because otherwise it can overpower the other flavours), and add the lamb stock. Continue reducing until a sauce-like consistency is achieved. Sieve the gravy and set aside (but keep it warm!).
  7. Once the venison reaches the target temperature, remove it from the oven and cover it in foil to rest. Make sure to rest it for at least 5 minutes.

For the risotto:

  1. Finely chop the onion and celery, and crush the garlic. Dice the pumpkin into 1 cm cubes, and finely shred the kale. Grate the Parmesan.
  2. Heat the olive oil over a medium heat in a large, non-stick pan, and add the onion and celery. Saute the vegetables gently for about 5 minutes until they are soft but not browning.
  3. Add the garlic and rice, and continue to cook for 2–3 minutes.
  4. Turn the heat up to high, and add the white wine and some salt. Continue to cook, while stirring regularly and adding stock when the risotto starts to dry out.
  5. When the rice is starting to soften, add the pumpkin and kale. Continue to cook the risotto, adding liquid when needed, until the rice is soft but not mushy.
  6. Stir in the Parmesan and a generous knob of butter, and leave the risotto to rest for at least a minute.

To serve, carve the venison into thick, even rounds. Arrange the risotto and venison on pre-heated plates. Spoon a little of the gravy onto the venison, and top the risotto with freshly-ground black pepper, parsley leaves and Parmesan shavings.

[1] Getting the centre of the venison to 63 °C is recommended if you want to make sure that any bacteria or other nasties are fully killed off, and will result in having venison that's "done" — with a centre that's slightly pink but still deliciously tender. If you'd like medium-rare, aim for 57 °C.

Tuesday, September 29, 2015

Using C library functions from LiveCode Builder

This blog post is part of an ongoing series about writing LiveCode Builder applications without the LiveCode engine.

Currently, the LiveCode Builder (LCB) standard library is fairly minimal. This means that there are some types of task for which you'll want to go beyond the standard library.

In a previous post, I described how to use LiveCode's foundation library. This lets you access plenty of built-in LiveCode functionality that isn't directly exposed to LCB code yet.

Someone else's problem

Often someone's already wrapped the functions that you need in another program, especially on Linux. You can run that program as a subprocess to access it. In LiveCode Script, you could use the shell function to run an external program. Unfortunately, the LCB standard library doesn't have an equivalent feature yet!

On the other hand, the standard C library's system(3) function can be used to run a shell command. Its prototype is:

int system(const char *command);

In this post, I'll describe how LCB's foreign function interface lets you call it.

Declaring a foreign handler

As last time, you can use the foreign handler syntax to declare the C library function. The com.livecode.foreign module provides some important C types.

use com.livecode.foreign

foreign handler _system(in pCommand as ZStringNative) \
      returns CInt binds to "system"

Some things to bear in mind here:

  • I've named the foreign handler _system because the all-lowercase identifier system is reserved for syntax tokens
  • The ZStringNative type automatically converts an LCB string into a null-terminated string in whatever encoding LiveCode thinks is the system's "native" encoding.
  • Because the C library is always linked into the LiveCode program when it's started, you don't need to specify a library name in the binds to clause; you can just use the name of the system(3) function.

Understanding the results

So, now you've declared the foreign handler, that's it! You can now just _system("rm -rf /opt/runrev") (or some other helpful operation). Right?

Well, not quite. If you want to know whether the shell command succeeded, you'll need to interpret the return value of the _system handler, and unfortunately, this isn't just the exit status of the command. From the system(3) man page:

The value returned is -1 on error (e.g., fork(2) failed), and the return status of the command otherwise. This latter return status is in the format specified in wait(2). Thus, the exit code of the command will be WEXITSTATUS(status). In case /bin/sh could not be executed, the exit status will be that of a command that does exit(127).

So if the _system handler returns -1, then an error occurred. Otherwise, it's necessary to do something equivalent to the WIFEXITED C macro to check if the command ran normally. If it didn't, then some sort of abnormal condition occurred in the command (e.g. it was killed). Finally, the actual exit status is extracted by doing something equivalent to the WEXITSTATUS C macro.

On Linux, these two macros are defined as follows:

#define WIFEXITED(status)     __WIFEXITED (__WAIT_INT (status))
#define WEXITSTATUS(status)   __WEXITSTATUS (__WAIT_INT (status))
#define __WIFEXITED(status)   (__WTERMSIG(status) == 0)
#define __WEXITSTATUS(status) (((status) & 0xff00) >> 8)
#define __WTERMSIG(status)    ((status) & 0x7f)
#define __WAIT_INT(status)    (status)

Or, more succinctly:

#define WIFEXITED(status)   (((status) & 0x7f) == 0)
#define WEXITSTATUS(status) (((status) & 0xff00) >> 8)

This is enough to be able to fully define a function that runs a shell command and returns its exit status.

module org.example.system

use com.livecode.foreign

private foreign handler _system(in pCommand as ZStringNative) \
      returns CInt binds to "system"

/*
Run the shell command pCommand and wait for it to finish.
Returns the exit status of pCommand if the command completed, and
nothing if an error occurred or the command exited abnormally.
*/
public handler System(in pCommand as String) \
      returns optional Number

   variable tStatus as Number
   put _system(pCommand) into tStatus

   -- Check for error
   if tStatus is -1 then
      return nothing
   end if

   -- Check for abnormal exit
   if (127 bitwise and tStatus) is not 0 then
      return nothing
   end if

   -- Return exit status
   return 255 bitwise and (tStatus shifted right by 8 bitwise)

end handler

end module
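
By way of illustration, another module could then use this handler like so (a hypothetical example; note that System() returns nothing unless the command exited normally):

use org.example.system

public handler RemoveScratchDirectory() returns Boolean
   variable tStatus as optional Number
   put System("rm -rf /tmp/scratch") into tStatus
   -- Succeeded only if the command exited normally with status 0
   return tStatus is 0
end handler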

Tip of the iceberg

This post has hopefully demonstrated the potential of LiveCode Builder's FFI. Even if you use only the C standard library's functions, you gain access to almost everything that the operating system is capable of!

Using a C function from LCB involves reading the manual pages to find out how the function should be used, and how best to map its arguments and return values onto LCB types; often, reading C library header files to understand how particular values should be encoded or decoded; and finally, binding the library function and providing a wrapper that makes it comfortable to use from LCB programs.

LiveCode Builder can do a lot more than just making widgets and — as I hope I've demonstrated — can do useful things without the rest of the LiveCode engine. Download LiveCode 8 and try some things out!

Wednesday, September 23, 2015

Roasted vegetable and chickpea tagine

It's been a while since I last posted a recipe here! Recently I've been having quite a lot of success with this Moroccan-inspired vegetarian recipe.

This recipe makes 6 portions.

Ingredients

For the roasted vegetables:

  • 350 g new potatoes, halved
  • 1 fennel bulb, trimmed & cut into batons
  • 1 medium carrot, cut into chunks
  • 1 large red pepper, cut into chunks
  • 1 large red onion, cut into chunks
  • 3 tbsp extra-virgin olive oil
  • 1 tsp cumin seeds
  • 1 tsp fennel seeds
  • 1 tsp coriander seeds, crushed

For the sauce:

  • 4 garlic cloves, chopped
  • 400 g canned chopped tomatoes
  • 400 g canned chickpeas, drained and rinsed
  • 250 ml red wine
  • 1 pickled lemon, finely chopped
  • ½ tbsp harissa paste
  • 1 tsp ras el hanout
  • 1 cinnamon stick
  • 40 g whole almonds
  • 10 dried apricots, halved

To serve:

  • Greek-style yoghurt
  • 2 tbsp coriander, finely chopped

Method

Preheat the oven to 200 °C fan. Put all the ingredients for the roasted vegetables into a large, heavy roasting tin, season to taste, and toss together to coat the vegetables in oil and spices. Roast for 30 minutes until the potatoes are cooked through and the vegetables generally have a nice roasted tinge.

While the vegetables are roasting, heat a large pan over a medium heat. Fry the garlic for 20–30 seconds until fragrant. Add the remaining ingredients, bring to the boil, and simmer while the vegetables roast.

When the vegetables are roasted, add them to the sauce and stir. Return the sauce to the simmer for another 15–20 minutes.

Serve in bowls, topped with a dollop of yoghurt and some chopped coriander. Couscous makes a good accompaniment to this dish if you want to make it go further.