Wednesday, February 22, 2017

Deploying buildbot workers on Windows Server 2016

At LiveCode, we use a buildbot system to perform our continuous integration and release builds. Recently, we moved from building our Windows binaries in a Linux container using Wine to building on a native Windows system running in an Azure virtual machine.

Deploying buildbot on Windows is not totally straightforward, and the documentation for installing it is quite hard to follow. It's quite important to us that our build infrastructure is reproducible, so we wanted to have a procedure that could bring up a buildbot worker on a newly-allocated server quickly and with as little manual intervention as possible.

This blog post provides step-by-step instructions for installing buildbot 0.8.12 on Windows Server 2016 Datacenter Edition, with explanations of what's going on at each step. The target configuration is a buildbot worker that runs as an unprivileged user and communicates with the buildbot master over an SSL tunnel. All of the commands are written using PowerShell. It's recommended to run them via the 'PowerShell ISE' application, running as a user in the 'Administrators' group. The full script is available as a GitHub Gist.

Although this describes installing buildbot 0.8.12, there's no reason it shouldn't work for buildbot 0.9.x. If you try it, please let me know how you get on in the comments.

Note: Don't run these commands unless you've checked them very carefully first. They're adapted from the scripts used for our buildbot deployment, and may not work as you expect. You should use them as the basis of your own installation script and test it thoroughly before using it in production.

Support functions

First, ensure that the script stops immediately if any error is thrown, and that "verbose" messages are displayed.

$VerbosePreference = 'Continue'
$ErrorActionPreference = 'Stop'

By default, PowerShell doesn't convert non-zero exit codes from subprocesses into errors, so define a helper function to do this. CheckLastExitCode throws an error on any non-zero exit code, but if there are other exit codes that should be considered successful, you can pass in an array of permitted exit codes, e.g. CheckLastExitCode @(0,10).

function CheckLastExitCode {
    param ([int[]]$SuccessCodes = @(0))
    if ($SuccessCodes -notcontains $LASTEXITCODE) {
        throw "Command failed (exit code $LASTEXITCODE)"
    }
}
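As a quick illustration of how the helper behaves, here is a minimal sketch. The function is repeated so that the snippet runs on its own, and cmd.exe is used purely as a convenient way to produce a known exit code; it isn't part of the deployment script.

```powershell
# Same helper as above, repeated so this snippet is self-contained
function CheckLastExitCode {
    param ([int[]]$SuccessCodes = @(0))
    if ($SuccessCodes -notcontains $LASTEXITCODE) {
        throw "Command failed (exit code $LASTEXITCODE)"
    }
}

# Exit code 10 is explicitly permitted, so this does not throw
cmd.exe /c 'exit 10'
CheckLastExitCode -SuccessCodes @(0, 10)

# Exit code 3 is not permitted by default, so this throws
cmd.exe /c 'exit 3'
try {
    CheckLastExitCode
} catch {
    Write-Output "Caught expected failure: $_"
}
```

Call CheckLastExitCode immediately after the external command it checks, because any later external command overwrites $LASTEXITCODE.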

The rest of the script relies on a Fetch-BuildbotResource function that obtains a named resource file and places it in a given output location. You'll need to implement it yourself. Fill in the blanks (possibly with some sort of Invoke-WebRequest):

function Fetch-BuildbotResource {
    param([string]$Path,
          [string]$OutFile)
    # Your code goes here
}
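One possible way to fill in the stub is sketched below. It assumes the resource files are served over HTTPS from a single base URL. The $k_resource_base variable and the example.org address are placeholders invented for this sketch, not part of our deployment.

```powershell
# Hypothetical location of the resource files; replace with your own server
$k_resource_base = 'https://resources.example.org'

function Fetch-BuildbotResource {
    param([Parameter(Mandatory = $true)][string]$Path,
          [Parameter(Mandatory = $true)][string]$OutFile)

    # Build the full URL from the base URL and the resource path
    $t_uri = "$($k_resource_base.TrimEnd('/'))/$Path"

    Write-Verbose "Fetching $t_uri to $OutFile"
    Invoke-WebRequest -Uri $t_uri -OutFile $OutFile -UseBasicParsing
}
```

If your resources need authentication, Invoke-WebRequest also accepts -Credential and -Headers parameters.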

It's also a good idea to activate Windows. The virtual machines provisioned by Azure may not have been activated; this command will do so automatically.

cscript.exe C:\Windows\System32\slmgr.vbs /ato

Finally, define variables with the root path for the buildbot installation and the IP or DNS address of the buildbot master, and create the buildbot worker's root directory.

$k_buildbot_root = 'C:\buildbot'
$k_buildbot_master = 'buildbot.example.org'

New-Item -Path $k_buildbot_root -ItemType Container -Force | Out-Null

Installing programs with Chocolatey

Chocolatey is a package manager for Windows that can automatically install a variety of applications and services in much the same way as the Linux `apt-get`, `dnf` or `yum` programs. Here, you can use it for installing Python (for running buildbot) and for installing the stunnel SSL tunnel service.

Install Chocolatey by the time-honoured process of "downloading a random script from the Internet and running it as a superuser".

$env:ChocolateyInstall = 'C:\ProgramData\chocolatey'

# Install Chocolatey, if not already present
if (!(Test-Path -LiteralPath $env:ChocolateyInstall -PathType Container)) {
    Invoke-WebRequest 'https://chocolatey.org/install.ps1' -UseBasicParsing | Invoke-Expression
}

Next, use Chocolatey to install stunnel and Python 2.7:

Write-Verbose 'Installing Python and stunnel'
choco install --yes stunnel python2
CheckLastExitCode

Installing Python modules and buildbot

It's easiest to install buildbot and its dependencies using the pip Python package manager.

Write-Verbose 'Installing Python modules'
$t_pip = 'C:\Python27\Scripts\pip.exe'
& $t_pip install pypiwin32 buildbot-slave==0.8.12
CheckLastExitCode

The pypiwin32 package installs some DLLs that are required for buildbot to run as a service, but when installed with pip, these DLLs are not automatically registered in the Windows registry. This caused me at least a day of wondering why my buildbot service was failing to start with a thoroughly unhelpful error message.

Luckily, pypiwin32 installs a script that will set everything up properly.

Write-Verbose 'Registering pywin32 DLLs'
$t_python = 'C:\Python27\python.exe'
& $t_python C:\Python27\Scripts\pywin32_postinstall.py -install

SSL tunnel service

You'll need to configure stunnel to run on your buildbot master, and listen on port 9988. I recommend configuring the buildbot master's stunnel with a certificate, and then making sure workers always fully authenticate the certificate when connecting to it. This will prevent people from obtaining your workers' login credentials by impersonating the buildbot master machine.

Write-Verbose 'Installing buildbot-stunnel service'
$t_stunnel = 'C:\Program Files (x86)\stunnel\bin\stunnel.exe'
$t_stunnel_conf = Join-Path $k_buildbot_root 'stunnel.conf'
$t_stunnel_crt  = Join-Path $k_buildbot_root 'buildbot.crt'

# Fetch the buildbot master's certificate, which the worker will use
# to authenticate the buildbot master
Fetch-BuildbotResource `
    -Path 'buildbot/stunnel/master.crt' -Outfile $t_stunnel_crt

# Create the stunnel configuration file
Set-Content -Path $t_stunnel_conf -Value @"
[buildbot]
client = yes
accept = 127.0.0.1:9989
cafile = $t_stunnel_crt
verify = 3
connect = ${k_buildbot_master}:9988
"@

# Register the stunnel service, if not already present
if (!(Get-Service buildbot-stunnel -ErrorAction Ignore)) {
    New-Service -Name buildbot-stunnel `
        -BinaryPathName "`"$t_stunnel`" -service `"$t_stunnel_conf`"" `
        -DisplayName 'Buildbot Secure Tunnel' `
        -StartupType Automatic
}

The buildbot worker instance

Creating and configuring the worker instance, and setting up buildbot to run as a Windows service, are the most complicated part of the installation process. Before dealing with the Windows service, instantiate a worker with the info it needs to connect to the buildbot master.

First, set up a bunch of values that will be needed later. The worker's name will just be the name of the server it's running on, and it will be configured to use a randomly-generated password.

Write-Verbose 'Initialising buildbot worker'

# Needed for password generation
Add-Type -AssemblyName System.Web

$t_buildbot_worker_script = 'C:\Python27\Scripts\buildslave'

$t_worker_dir = Join-Path $k_buildbot_root worker
$t_worker_name = $env:COMPUTERNAME
$t_worker_password = `
    [System.Web.Security.Membership]::GeneratePassword(12,0)
$t_worker_admin = 'Example Organisation'

Run buildbot to actually instantiate the worker. We have to manually check the contents of the standard output from the setup process, because the exit status isn't a reliable indicator of success.

$t_log = Join-Path $k_buildbot_root setup.log
Start-Process -Wait -NoNewWindow -FilePath $t_python `
    -ArgumentList @($t_buildbot_worker_script, 'create-slave', `
        $t_worker_dir, '127.0.0.1', $t_worker_name, `
        $t_worker_password) `
    -RedirectStandardOutput $t_log

# Check log file contents
$t_expected = "buildslave configured in $t_worker_dir"
if ((Get-Content $t_log -Tail 1) -ne $t_expected) {
    # Use Write-Warning so the whole log is shown before throwing
    Get-Content $t_log | Write-Warning
    throw 'Build worker setup failed'
}

It's helpful to provide some information about the host and who administers it.

Set-Content -Path (Join-Path $t_worker_dir 'info\admin') `
    -Value $t_worker_admin
Set-Content -Path (Join-Path $t_worker_dir 'info\host') `
    -Value (Get-WmiObject -Class Win32_OperatingSystem).Caption

While testing our Windows-based buildbot workers, I was getting "slave lost" errors during many build steps. Getting the workers to send very frequent "keep alive" messages to the build master prevented this almost entirely. I used a 10 second period, but you might find that unnecessarily frequent.

$t_config = Join-Path $t_worker_dir buildbot.tac
Get-Content $t_config | `
    ForEach {$_ -replace '^keepalive\s*=\s*.*$', 'keepalive = 10'} | `
    Set-Content "$t_config.new"
Remove-Item $t_config
Move-Item "$t_config.new" $t_config
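To see what the substitution above does without touching a real configuration file, here is a small sketch that applies the same regular expression to a few sample lines. The sample values are made up for illustration and aren't taken from a real buildbot.tac.

```powershell
# Illustrative stand-in for a few lines of buildbot.tac (values are made up)
$t_sample = @(
    "slavename = 'EXAMPLE'",
    'keepalive = 600',
    'usepty = 0'
)

# Apply the same replacement as the deployment script
$t_result = $t_sample |
    ForEach-Object { $_ -replace '^keepalive\s*=\s*.*$', 'keepalive = 10' }

$t_result
```

Only the line that starts with "keepalive" is rewritten; the other lines pass through unchanged because the pattern is anchored to the start of the line.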

Configuring the buildbot service

Now for the final part: getting buildbot to run as a Windows service. It's a bad idea to run the worker as a privileged user, so this will create a 'BuildBot' user with a randomly-generated password, configure the service to use that account, and make sure it has full access to the worker's working directory.

Some of the commands used in this section expect passwords to be handled in the form of "secure strings" and some expect them to be handled in the clear. There's a fair degree of shuttling between the two representations.
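As a rough sketch of that shuttling, the snippet below converts a clear-text password to a secure string and back again. The password is made up, and the NetworkCredential approach is just one common way to recover the clear text; the deployment script itself only needs the clear-to-secure direction.

```powershell
# A made-up password, for illustration only
$t_clear = 'correct horse battery staple'

# Clear text -> secure string (the form New-LocalUser expects)
$t_secure = ConvertTo-SecureString $t_clear -AsPlainText -Force

# Secure string -> clear text (the form command-line tools expect)
$t_credential = New-Object System.Net.NetworkCredential `
    -ArgumentList '', $t_secure
$t_round_trip = $t_credential.Password
```

Keeping both forms in variables, as the script below does, avoids converting back and forth.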

Once again, begin by setting up some variables to use during these steps.

Write-Verbose 'Installing buildbot service'

$t_buildbot_service_script = 'C:\Python27\Scripts\buildbot_service.py'
$t_service_name = 'BuildBot'
$t_user_name = $t_service_name
$t_full_user_name = "$env:COMPUTERNAME\$t_service_name"

$t_user_password_clear = `
    [System.Web.Security.Membership]::GeneratePassword(12,0)
$t_user_password = `
    ConvertTo-SecureString $t_user_password_clear -AsPlainText -Force

Create the 'BuildBot' user:

$t_user = New-LocalUser -AccountNeverExpires `
    -PasswordNeverExpires `
    -UserMayNotChangePassword `
    -Name $t_user_name `
    -Password $t_user_password

You need to create the buildbot service by running the installation script provided by buildbot. Although there's a New-Service command in PowerShell, the pywin32 support for services written in Python expects a variety of registry keys to be set up correctly, and it won't work properly if they're not.

& $t_python $t_buildbot_service_script `
    --username $t_full_user_name `
    --password $t_user_password_clear `
    --startup auto install
CheckLastExitCode

It's still necessary to tell the service where to find the worker directory. You can do this by creating a special registry key that the service checks on startup to discover its workers.

$t_parameters_key = "HKLM:\SYSTEM\CurrentControlSet\Services\$t_service_name\Parameters"
New-Item -Path $t_parameters_key -Force
Set-ItemProperty -Path $t_parameters_key -Name "directories" `
    -Value $t_worker_dir

Although the service is configured to start as the 'BuildBot' user, that user doesn't yet have the permissions required to read and write in the worker directory.

$t_acl = Get-Acl $t_worker_dir
$t_access_rule = New-Object `
    System.Security.AccessControl.FileSystemAccessRule `
    -ArgumentList @($t_full_user_name, 'FullControl', `
        'ContainerInherit,ObjectInherit', 'None', 'Allow')
$t_acl.SetAccessRule($t_access_rule)
Set-Acl $t_worker_dir $t_acl

Granting 'Log on as a service' rights

Your work is nearly done! However, there's one task that I have not yet worked out how to automate, and that still requires manual intervention: granting the 'BuildBot' user the right to log on as a service. Without this right, the buildbot service will fail to start with a permissions error.

  1. Open the 'Local Security Policy' tool
  2. Choose 'Local Policies' -> 'User Rights Assignment' in the tree
  3. Double-click on 'Log on as a service' in the details pane
  4. Click 'Add User or Group', and add 'BuildBot' to the list of accounts

Time to launch

Everything should now be correctly configured!

There's one final bit of work required: you need to add the worker's username and password to the buildbot master's list of authorised workers. If you need it, you can obtain the username and password for the worker using PowerShell:

Get-Content C:\buildbot\worker\buildbot.tac | `
    Where {$_ -match '^(slavename|passwd)' }

You can use the `Start-Service` command to start the stunnel and buildbot services:

Start-Service buildbot-stunnel
Start-Service buildbot
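Once the services have been started, a quick sanity check along the following lines can confirm that they're running and that stunnel is accepting local connections. This is a minimal sketch that assumes the service names and port used earlier in this post. It only shows that the local end of the tunnel is listening, not that the master is reachable.

```powershell
# Report the state of both services; 'Running' is what we hope to see
Get-Service -Name buildbot-stunnel, BuildBot |
    Format-Table -Property Name, Status

# Check that stunnel is accepting connections on the local tunnel endpoint
$t_check = Test-NetConnection -ComputerName 127.0.0.1 -Port 9989
if (-not $t_check.TcpTestSucceeded) {
    throw 'stunnel is not listening on 127.0.0.1:9989'
}
```

If the worker still doesn't appear on the master, the twistd.log file in the worker directory is usually the next place to look.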

Conclusions

You can view the full script described in this blog post as a GitHub Gist.

On top of installing buildbot itself, you'll need to install the various toolchains that you require. If you're using Microsoft Visual Studio, the "build tools only" installers provided by Microsoft for MSVC 2010 and MSVC 2015 are really useful. Many other dependencies can be installed using Chocolatey.

Installing buildbot on Windows is currently a pain, and I hope that someone who knows more about Windows development than I do can help the buildbot team make it easier to get started.
