Setting up a Big-Sur Virtual Machine for Continuous Integration
Our CI infrastructure features two oldish macOS runners that don't allow us to build binaries for the Apple M1 architecture. An upgrade is desperately needed...
As of Pd-0.52, the official macOS binaries distributed at Miller Puckette's website are built on the iem-ci, since I set it up to do automatic code signing (and notarization) of applications using the iem's developer certificates. This helps Apple users, as they don't have to jump through hoops to get Pd running.
The macOS runners on the CI are VMs running macOS 10.12 (Sierra) and macOS 10.14 (Mojave) respectively, with Xcode versions that were current back then.
As such, they are not capable of building arm64 binaries for the Apple M1.
People have complained on the Pd-list about that (although I still think that it is a good thing that the default macOS download is currently x86_64 only,
as there are no externals available on deken yet for the arm64 architecture), and I figured it was time to set up a modern runner.
Also (somewhat) recently, brew started to complain bitterly that the Xcode version installed is way too old.
Requirements
We are using GitLab-CI as the software to drive the iem-ci.
I'd like to enforce isolation of CI-runs.
This rules out having a simple ssh runner that runs directly (bare-metal) on a macOS host.
Since there is no native Docker for macOS (or: there was none back then; and I guess you can't easily get a Docker image with Xcode installed anyhow),
I settled on using VMs.
When setting up our initial macOS CI we bought a refurbished Mac Mini 5,2 (mid 2011) (yikes: happy 10th birthday!)
to be used as the host for the macOS runners (given that Apple only allows macOS to run on Apple hardware).
I did not want to invest in Parallels (partly because I do not like paying for licenses, and partly because of the missing native support in GitLab-CI), so I checked out VirtualBox. I've been using VirtualBox on my desktop for a while - it makes cross-platform development really easy. However, I quickly decided that running VirtualBox on a macOS host was not much fun, so I picked my beloved Debian as the host OS (which happily runs on the Mac Mini) and ran macOS in VMs. That should satisfy the requirement to only run macOS on Apple hardware (even if there is a Debian GNU/Linux layer involved somewhere in between).
Anyhow, that is long gone.
The current task was to get a modern macOS (either macOS11 Big Sur or macOS12 Monterey) to run in a VM on that old hardware.
Vagrant
A fine way to get Virtual Machine images for all kinds of tasks is HashiCorp's Vagrant. Vagrant provides an abstraction layer over various virtualization providers (like QEMU, Docker, Parallels, VirtualBox, ...) and preconfigured images (so-called "boxes" in Vagrant lingo) for these providers, targeting the easy setup of development environments.
Using vagrant is ridiculously simple. To download and boot a VM, simply run something like:

```shell
mkdir alpine38; cd alpine38
vagrant init generic/alpine38
vagrant up
```
The first command (besides the mkdir && cd stuff) creates a Vagrantfile that describes the box.
The second command downloads a disk image and some meta info, and then loads the named box (here: generic/alpine38) into your provider (e.g. VirtualBox) and boots it.
VMs are run in the background, so there's no GUI attached to them; but if you need one, you can open it via the VirtualBox control center.
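For reference, the Vagrantfile generated by vagrant init is mostly commented-out examples; stripped down, it is just the box name. If you want the VirtualBox GUI to show up on boot, the provider block can request it - a minimal sketch, using the standard `vb.gui` option of the VirtualBox provider:

```ruby
# Minimal Vagrantfile for the alpine box from above.
Vagrant.configure("2") do |config|
  config.vm.box = "generic/alpine38"

  # optional: show the VirtualBox GUI instead of running headless
  config.vm.provider :virtualbox do |vb|
    vb.gui = true
  end
end
```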
More often, you just login to the running VM using ssh, with something like:
```shell
vagrant ssh
```
Boxes are cached, so if you run vagrant up later, it won't download the disk image again (but you also need the disk-space for the cache!).
macOS Vagrant boxes
There are a couple of macOS boxes available for Vagrant.
I'm not 100% sure about the legal status of these images, but my understanding is that it's fine to download a macOS image (after all, Apple provides upgrades for free themselves), as long as you only ever run them on Apple hardware. In any case, there have been macOS images available on Vagrant for ages, so apparently it is fine (or nobody really cared).
Unfortunately, I did not readily find any macOS12 Monterey images for VirtualBox, but at least there is a handful of macOS11 Big Sur boxes available.
The boxes differ greatly in size (from 11GB to 40GB) and in which software comes preinstalled (nothing at all; Xcode; brew; ...). So I tried all of them (using the latest and greatest version of each), with varying success.
Some of the boxes would just boot cycle, others would boot without a Desktop. (My test machine is an ordinary Debian amd64 machine.)
| provider | version | notes |
|---|---|---|
| abelich/macos | v0.1.1 | boot stops in UEFI-shell 💥 |
| amarcireau/macos | v11.3.1 | boot cycles 💥 |
| nick-invision/macos-bigsur-base | v0.0.2 | boot cycles 💥 |
| tampham/automation-macos | v0.0.3 | boots, but no desktop; cannot install things; cannot access ~/Desktop/ and similar |
| thinhho/automation-macos | v0.0.4 | boots, but no desktop |
So obviously, most of the machines don't work for me, but at least two of them boot and allow me to login.
My original bet was tampham/automation-macos,
as it is supposed to come with a more-or-less up-to-date version of Big Sur (11.4), has Xcode 12.5 and Python3
already installed, and it has a decent size (11.3GB).
It turned out to be not so nice in the end, as neither Xcode nor Python3 were installed.
Worse, I couldn't actually install anything on the machine, as it refused to mount .dmg images and the like
with some generic error code (which I haven't noted).
Those problems are probably solvable on the Desktop, but the machine wouldn't boot into a GUI.
As an added "bonus", I wasn't even able to access the ~/Desktop/ folder on the machine via ssh.
AFAICT this is some Apple security feature, to prevent exploits. :-(
Anyhow, the thinhho/automation-macos box - albeit a hefty 32GB - does work and already has Xcode and Python3 installed.
It also comes with a full node.js development stack and OpenJDK and whatnot, which we don't need.
Cleaning up
There is quite a lot of stuff on the disk that we won't ever use, so let's get rid of that first:
```shell
# remove large installation files in the home-directory
rm -rf ~vagrant/node_home ~vagrant/automation-projects ~vagrant/.npm ~vagrant/.appium
find ~vagrant/Library/Developer/CoreSimulator/Caches/dyld/ -depth -mindepth 1 -print -delete
rm -rf ~vagrant/Library/Developer/Xcode/DerivedData ~vagrant/Library/Developer/Xcode/*DeviceSupport

# remove node from /usr/local
rm -f /usr/local/bin/np[mx]
find /usr/local/ -depth -mindepth 1 -name "*node*" -exec rm -rf {} +
find /usr/local/ -depth -mindepth 1 -type d -delete
```
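The depth-first find-and-delete pattern used here is worth a note: `-depth` makes find process a directory's contents before the directory itself, so matching trees can be removed in one pass. A small sketch on a scratch directory (all paths here are made up for illustration):

```shell
#!/bin/sh
# Demonstrate the "find -depth -mindepth 1 ... -exec rm -rf" pattern on a
# scratch directory (so we don't touch a real /usr/local).
set -e
scratch=$(mktemp -d)
mkdir -p "$scratch/node_modules/foo" "$scratch/keepme"
touch "$scratch/node_modules/foo/index.js" "$scratch/keepme/file.txt"

# remove everything matching "*node*" below $scratch, depth-first
find "$scratch" -depth -mindepth 1 -name "*node*" -exec rm -rf {} +

ls "$scratch"          # only "keepme" is left
rm -rf "$scratch"
```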
I found that Vagrant even provides a way to run such cleanup "provisioning" scripts automatically when the machine is first set up, so I opted for this (it's always nice to be able to simply reproduce the VM in case things go haywire):
```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "thinhho/automation-macos"
  config.vm.box_version = "0.0.4"
  config.vm.provider :virtualbox do |vb|
    vb.name = "macOS11.4 Big Sur"
  end
  config.vm.synced_folder ".", "/vagrant", disabled: true

  config.vm.provision "shell", path: "provision.sh"

  config.vm.provision "file", source: "../gitlab-prebuild-cleanup", destination: "/tmp/"
  config.vm.provision "shell", inline: "mkdir -p /usr/local/bin/; chmod a+x /tmp/gitlab-prebuild-cleanup; mv /tmp/gitlab-prebuild-cleanup /usr/local/bin/"
end
```
The `provision.sh` script mentioned in the config is basically the cleanup script you find above.
Finally, this also installs the gitlab-prebuild-cleanup script, which is run at the beginning of each GitLab-CI job
(and does various things like syncing the time or getting the latest gitlab-runner (required within the VM for uploading artifacts)).
The installation of the script is a two-step process:
- copy the file from the host into the VM's `/tmp/` directory
- within the VM, run some inline bash code that moves the script into the VM's `/usr/local/bin/` directory

The reason for this complication is simply that the file provisioner of Vagrant runs as an unprivileged user who has no permission to write to `/usr/local` directly.
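The same two-step dance can be mimicked in plain shell; here sandboxed under a scratch prefix (the script contents and paths are stand-ins for illustration):

```shell
#!/bin/sh
# Two-step install: stage the file somewhere world-writable first, then
# move it into place in a second (privileged) step. Sandboxed under a
# scratch prefix instead of the VM's real /tmp and /usr/local/bin.
set -e
prefix=$(mktemp -d)

# step 1: the unprivileged "file" provisioner drops the script into /tmp
mkdir -p "$prefix/tmp"
printf '#!/bin/sh\necho cleanup\n' > "$prefix/tmp/gitlab-prebuild-cleanup"

# step 2: the "shell" provisioner (running privileged) moves it into place
mkdir -p "$prefix/usr/local/bin"
chmod a+x "$prefix/tmp/gitlab-prebuild-cleanup"
mv "$prefix/tmp/gitlab-prebuild-cleanup" "$prefix/usr/local/bin/"

"$prefix/usr/local/bin/gitlab-prebuild-cleanup"   # prints "cleanup"
rm -rf "$prefix"
```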
The full configuration can be found at https://git.iem.at/iem/iem-ci/-/tree/main/osx/bigsur.
Installing things
As Xcode and Python3 are already installed, we only need to install brew.
This can be done with our gitlab-prebuild-cleanup script,
but a fresh installation of brew requires a bit of interactivity,
so I just ran that script manually in an ssh session after the machine was provisioned.
The initial run of brew takes a while...
```shell
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew update
brew install automake cmake pkg-config gettext libtool ninja wget
```
As you can see, I install some build systems that are often used on our CI runners.
Originally this list also included python@3.9 and thii/xcbuild/xcbuild,
but the former is already installed on the Vagrant box
and I think the latter was never really used here (and it failed to build...).
Deploying the image
I did all the Vagrant tests on my desktop machine rather than on the runner host. Once I was satisfied, the VM needed to be deployed to the destination host.
This is simple enough:
- power down the VM
- copy the entire VirtualBox folder to the target (using rsync):

```shell
rsync -avz macOS11 root@lovelace:/home/gitlab-runner/VMs/
```

- on the target, register the .vbox file with VBoxManage:

```shell
VBoxManage registervm /home/gitlab-runner/VMs/macOS11/macOS11.vbox
```

- boot the machine
As a pleasant surprise, the VM obviously detected it was running on real Apple hardware, and even started up the Desktop.
Finalizing
To reduce boot-time, I use prebooted VMs for my runners.
For this I create two snapshots:

- `gitlab-base`: the powered-off machine with everything installed that we are likely to need for a CI system. This snapshot is mostly needed for software upgrades every now and then, and in case something goes wrong (I've seen running VMs fail to start after a VirtualBox upgrade).
- `gitlab-ci`: a snapshot of the running VM (booted from `gitlab-base`, after waiting until the VM no longer shows any disk activity)
gitlab-runner
Now we only need to register the new VM as a runner, using `gitlab-runner register`; the resulting config.toml looks like this:
```toml
[[runners]]
  name = "bigsur@lovelace"
  url = "https://git.iem.at/"
  token = "SECRET"
  executor = "virtualbox"
  pre_clone_script = "sudo /usr/local/bin/gitlab-prebuild-cleanup -t; sleep 1; sudo /usr/local/bin/gitlab-prebuild-cleanup -t; sudo /usr/local/bin/gitlab-prebuild-cleanup -u; sudo /usr/local/bin/gitlab-prebuild-cleanup -cU"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.ssh]
    user = "vagrant"
    password = "PASSWORD"
    disable_strict_host_key_checking = true
  [runners.virtualbox]
    base_name = "bigsur"
    base_snapshot = "gitlab-ci"
    base_folder = ""
    disable_snapshots = true
```
Notes
- the `pre_clone_script` cleans up the VM from any leftovers of the provisioning, syncs the clock via NTP, and updates the gitlab-runner application.
- in `runners.virtualbox` we declare the `gitlab-ci` snapshot of the booted VM as `base_snapshot`.
- I cannot remember the exact details, but I think `disable_snapshots` forks off a new (linked) VM rather than using snapshots of snapshots, which I preferred for some reason (probably the runner would otherwise keep snapshots for each project, which would quickly exhaust the available disk space).
- we definitely need `disable_strict_host_key_checking = true`. Starting with (the upcoming) GitLab 15, the default will be to enforce strict host key checking. Unfortunately this does not work with the VirtualBox runner, as the ssh port we connect to is an ephemeral one (the port forwarding is set up at runtime), so the host value changes with each run - making strict host key checking pointless. (Unless the host key checking could be made to ignore the port; but I don't think that is possible.)