Big build farm on a budget

Tue, Mar 16, 2021 11-minute read

Being able to push a change to a server which builds and tests it frees you up to do other things, but getting the machines together to create a build farm for any large build can be expensive. If you have a long build, like the Android Open Source Project (AOSP), a “from clean” build can take hours even when you have a powerful machine and a well-designed build.

There is a way to reduce those costs significantly, though, using a reference build and the overlay filesystem.

Reference Builds

All changes to a codebase share a common ancestor: the branch which contains all previously merged changes.

I have a single powerful build machine. It checks out the branch of the repository where all previously approved changes have been merged and performs a cold build. That build takes a long time, but any change I’m working on is, compared to the whole codebase, small, and so a well-designed build allows me to quickly rebuild only the things related to that small change.

With many repositories this would be the git main branch; with the AOSP it could be a version-specific branch such as android11-qpr1-release. The important thing is that there is a common ancestor commit on the primary revision control branch which is only a small number of commits away from my change (it may even be the direct parent of my change’s commit).

The collection of the source code, revision control system information (e.g. the .git directories), and build output is what I call the reference build. We can use it in the same way video decoders use reference frames: as a known starting point which reduces the amount of work needed in change-checking builds.

Overlay Filesystem

The next piece of the puzzle is the overlay filesystem. Unlike most filesystems people use, the overlay filesystem doesn’t specify how to lay out blocks of data in a disk partition. Instead it allows you to take a directory which should not be modified, provide a separate directory where any changes to the unmodifiable directory should be stored, and then gives you a combined representation of the two.

In OverlayFS terminology the unmodifiable directory is called the lower filesystem, and the area for the changes is called the upper filesystem. Think of these as glass plates: if something has been written to the upper layer you will see that instead of anything on the lower layer, but if nothing is on the upper layer you’ll see through to what’s on the lower layer.
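
To make the glass-plates picture concrete, here’s a minimal sketch you can try; the /tmp/demo paths are my own choice, and the mount itself needs root and OverlayFS support, so the script skips it when run unprivileged:

```shell
#!/bin/sh
# Hypothetical demo layout: a read-only lower directory, a writable upper
# directory, a work directory OverlayFS needs, and the merged view.
mkdir -p /tmp/demo/lower /tmp/demo/upper /tmp/demo/work /tmp/demo/merged
echo "default" > /tmp/demo/lower/config.txt

# The mount requires root; skip it (and the demo writes) when unprivileged.
if [ "$(id -u)" -eq 0 ] &&
   mount -t overlay overlay \
         -o lowerdir=/tmp/demo/lower,upperdir=/tmp/demo/upper,workdir=/tmp/demo/work \
         /tmp/demo/merged 2>/dev/null; then
    cat /tmp/demo/merged/config.txt   # "default": seen through from the lower layer
    echo "custom" > /tmp/demo/merged/config.txt
    cat /tmp/demo/lower/config.txt    # still "default": the write landed in upper/
    cat /tmp/demo/upper/config.txt    # "custom": this file now masks the lower one
    umount /tmp/demo/merged
fi
```

The write to the merged view never touches the lower directory, which is exactly the property that lets many machines share one read-only reference build.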

The best working example is one of OverlayFS’s original design use cases: consumer-configurable devices. When you update the firmware on a device like, say, your home broadband router, you’re providing it with a filesystem image which contains all the code the router needs to run and an area containing a set of configuration files with the device defaults. This configuration area forms the lower layer.

When you make a configuration change the lower layer is not changed; instead the change is written to the upper layer, which only exists on your device in an SSD-like storage area. When a program asks for a configuration file the overlay filesystem gives it the version from your local upper layer with your changes. If you’ve not made any configuration changes the program gets the file with the system defaults from the lower layer which was part of the firmware image.

When you perform a “factory reset” all that happens is the upper layer is deleted, and all the programs on the device see the lower layer, which contains the system defaults. This is how many devices can perform a factory reset when you boot them while holding down a reset button: early in the boot process they check if the button is being held, and if it is they delete the upper layer and reboot so only the lower layer is used.

Putting the two together

So, by now, you’ve probably seen where this is going: if you create a reference build which becomes the lower layer in an overlay filesystem, and provide an empty upper layer, you can pull the change you want to test and build it. If the build is well designed it will only rebuild the parts which have changed, rather than needing to rebuild everything.

You may be wondering how much of a speed-up this can give. My main build machine is a 12-core, 48 GB RAM, SSD-backed machine which takes about 2 hours and 30 minutes to do a full AOSP build. Using the output from that as my reference build, and making a simple single-file change, I can get a build with the change from a roughly 10-year-old Core i5-2320 with 16 GB RAM and an SSD in under 10 minutes. The last time I tried to do a full build on the Core i5 machine I gave up after 5 hours.

Distributing the reference build

The area that needs the most consideration is distributing the reference builds. With a build like the AOSP a tar file of the source code and build output can be tens or hundreds of gigabytes, so pushing a tar file of the reference build to every single build machine can take a long time, and even using something like rsync can be a slow process.

To avoid lots of large data transfers I use NFS, the Network File System, because a lot of build tool “Up to date” checks involve just checking file metadata (e.g. the last modified time), and, with NFS, my build machine can provide that information to the less powerful machines without the need to transfer the entire file contents.
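
As a tiny illustration of the kind of check involved, the shell’s `-nt` (newer than) comparison, much like a build tool’s up-to-date check, reads only metadata; the file names here are illustrative, and the fixed dates need GNU touch:

```shell
#!/bin/sh
# Make a build output that is older than its source, using fixed dates so
# the example is deterministic (GNU touch -d syntax).
touch -d '2001-01-01' /tmp/out.o
touch -d '2002-01-01' /tmp/src.c

# "-nt" compares modification times only; no file contents are read, which
# is why NFS can answer such checks without shipping the data.
if [ /tmp/src.c -nt /tmp/out.o ]; then
    echo "rebuild needed"
fi
```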

My big build machine has four areas which are exported via NFS: a configuration export, and a set of three areas for the builds. This takes up a lot of space, but, for me, it’s worth the investment in some SSDs to make these available in a resilient way.
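
For illustration, the corresponding /etc/exports on big-builder might look something like this; the subnet is an assumption about your network, and the exports can be read-only because all writes on the less powerful machines land in their local overlay upper layers:

```
# /etc/exports on big-builder (hypothetical subnet)
/build-config  192.168.1.0/24(ro,no_subtree_check)
/build-1       192.168.1.0/24(ro,no_subtree_check)
/build-2       192.168.1.0/24(ro,no_subtree_check)
/build-3       192.168.1.0/24(ro,no_subtree_check)
```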

The reason I have three build areas is to avoid breaking builds on the less powerful machines. The build areas move through three stages:

  • current - where the most recently completed build is.
  • legacy - where the previously completed build is.
  • recyclable - where the next build should take place.

Having the legacy area means that, if a less powerful machine is doing something which runs for a long time, and the big build machine completes one build and starts the next, the less powerful machine doesn’t see the reference build it’s using disappear (unless the job it’s doing takes longer than two complete builds, in which case that job needs to be worked on).

How I make this work

The build areas on my big build machine (called big-builder) are called build-1, build-2, and build-3. The configuration area (called build-config) contains a single file (called latest-build) which holds the name of the area containing the most recent build (i.e. it contains build-2 if the most recent build was in build-2). The less powerful machines have the following entries in /etc/fstab to ensure they have the build config and build areas mounted:

big-builder:/build-config   /mnt/build-config    nfs   nofail   0    0
big-builder:/build-1        /mnt/build-1         nfs   nofail   0    0
big-builder:/build-2        /mnt/build-2         nfs   nofail   0    0
big-builder:/build-3        /mnt/build-3         nfs   nofail   0    0

I use the nofail option to allow the less powerful machines to boot without the big build server being available. That way if I’m not doing AOSP work I can leave the power hungry beast turned off.
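
A rotation step along these lines could run on big-builder after each reference build completes. This is just a sketch, not a description of my actual tooling: the `next_area` helper is an illustrative name, and with three fixed areas plus the latest-build pointer file, the current/legacy/recyclable rotation reduces to a cyclic advance of the pointer (the area after “current” is recyclable, the one before it is legacy):

```shell
#!/bin/sh
# Hypothetical rotation for big-builder; CONFIG points at the latest-build
# pointer file described above.
CONFIG=${CONFIG:-/build-config/latest-build}

# Map each build area to the next one in the cycle.
next_area() {
    case "$1" in
        build-1) echo build-2 ;;
        build-2) echo build-3 ;;
        build-3) echo build-1 ;;
    esac
}

if [ -f "$CONFIG" ]; then
    current=$(cat "$CONFIG")
    target=$(next_area "$current")   # the recyclable area

    # ... wipe /mnt/$target and perform the full reference build into it ...

    # Publish atomically by rewriting the pointer file, so a less powerful
    # machine reading latest-build never sees a half-written name.
    printf '%s\n' "$target" > "$CONFIG.tmp" && mv "$CONFIG.tmp" "$CONFIG"
fi
```

Because only the pointer moves, the previous build (now “legacy”) stays on disk untouched for any machine still using it.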

Each less powerful machine has a couple of local directories which are used by the overlay filesystem, /mnt/work and /mnt/upper, and a directory in which the build takes place, /mnt/build.

When a less powerful machine needs to perform a build it runs a script, similar to the one below, as root (which is necessary to perform the mount):

#!/bin/bash

CURRENT_BUILD_AREA=$(cat /mnt/build-config/latest-build)

# Remove any previous build (the unmount may fail if nothing is mounted yet)
umount /mnt/build 2>/dev/null || true
rm -rf /mnt/work /mnt/upper

# Ensure the mount points exist
mkdir -p /mnt/work /mnt/upper /mnt/build

# Mount the overlay filesystem
mount -t overlay \
      -o lowerdir=/mnt/$CURRENT_BUILD_AREA,upperdir=/mnt/upper,workdir=/mnt/work,noatime,nodiratime \
      overlay \
      /mnt/build

After this has run we can start working in /mnt/build to check out the change we want to build.

The reason I include the revision control information in the reference build is so I can do the following:

$ cd /mnt/build
$ git fetch --all
$ git checkout interesting_thing
$ build
$ run_tests
...
$ profit

Because the reference build is available, the build step triggers an incremental build which takes a few minutes instead of hours.

If you want to, you can actively develop in /mnt/build, including pushing changes back to your source control repository. The only thing you need to remember is that you must stay with the same reference build; if you want to use a more recent reference build you’ll need to delete your local upper and work areas, which will delete any local changes. This means that, if you’re planning to develop in an overlay-filesystem-backed area, you shouldn’t use a system of automatically rotating reference builds in the way I do; instead, make sure you have a reference build that doesn’t get deleted until you’re finished using it.

AOSP quirks

It’s worth mentioning a small niggle I have with the AOSP build: it hasn’t been created to provide a fourth R I like to see in builds, being Relocatable.

The AOSP build keeps track of the absolute path where it was built, which means that if you used /mnt/build-1, /mnt/build-2, or /mnt/build-3 on your big build machine, your less powerful machines would do more work because they would be building in a different place (/mnt/build).

There are some ways around this, none of which I particularly like, but they do ensure that the work done on the less powerful machines is kept to a minimum.

The least performance-impacting solution is to create a partition per build area (build-1, build-2, and build-3 in my case) and then double mount the partition you want to build in. If, for example, /dev/nvme0n1p1 is the partition for build-1, and it’s using the ext4 filesystem, your /etc/fstab file would contain:

/dev/nvme0n1p1   /mnt/build-1   ext4   noatime,nodiratime  0  1

then, before your build, you would run:

$ sudo mount /dev/nvme0n1p1 /mnt/build

then build in /mnt/build to ensure the big build machine creates reference builds with the same path as the less powerful machines will be using when they build in their overlay filesystem.

The downside of this is, of course, that each partition takes up a fixed amount of disk space, so, as the build grows, you may need to repartition your disk to make more room.

Other ways of solving this issue involve using files instead of partitions with loopback mounts, or building in the location the less powerful machines will use and then rsync-ing the build to the exported areas; both will increase the time it takes for reference builds to become available.
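
For completeness, here’s a rough sketch of the loopback-file variant; the image path and 1G size are illustrative only. Note that the image should be attached to a single loop device which is then mounted at both paths, since two independent `mount -o loop` calls would create two loop devices over the same file, which is unsafe:

```shell
#!/bin/sh
# Hypothetical image file standing in for one build area's partition.
IMG=/tmp/build-1.img

# A sparse file only consumes disk space as data is written into it.
truncate -s 1G "$IMG"

# Put a filesystem on the image (guarded in case e2fsprogs is missing).
{ command -v mkfs.ext4 >/dev/null && mkfs.ext4 -q -F "$IMG"; } || true

# Attaching and mounting need root: attach the image to ONE loop device,
# then mount that device at both the exported path and the build path.
if [ "$(id -u)" -eq 0 ]; then
    LOOPDEV=$(losetup --find --show "$IMG") &&
        mkdir -p /mnt/build-1 /mnt/build &&
        mount "$LOOPDEV" /mnt/build-1 &&
        mount "$LOOPDEV" /mnt/build ||
        echo "loop mount skipped (needs root and loop device support)"
fi
```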

Going next level

You might be wondering how well this scales. Like anything, there’s a limit, and in this case it’s usually the machine serving the reference build.

If your reference build server is saturating the network between it and the less powerful machines, you can look at upgrading your network speed, or at splitting the network into separate subnets by putting a multi-port Ethernet adapter in your reference-build-serving machine and placing the less powerful machines on subnets with fewer machines on each.

If you find that your reference build machine is struggling to both serve all the lesser build machines and perform builds, you’ve got a couple of options: push completed builds to another server which is dedicated to serving reference builds (I’ve used a QNAP NAS to do this in the past), or upgrade your build server (which may be cheaper than getting a dedicated file-serving machine).

If you start using a dedicated file-serving machine you can then move towards replicating the reference builds between multiple file-serving machines, either in a hierarchy or in a more peer-to-peer fashion using something like ping-time-weighted BitTorrent.

The important thing here is that you only need one really powerful machine; the one creating reference builds. The reference build file servers and change building machines can be a lot less powerful, and a lot cheaper to horizontally scale if you need extra capacity.

That’s all folks

Hopefully this has given you some ideas about how you can scale up your AOSP build and test system in a way which can serve multiple people without breaking the bank.

If you have questions or feedback you can find me on Mastodon, GitHub, and Twitter.