Setting up a CUDA workstation with Docker
Overview
Everybody does Artificial Intelligence, Machine Learning and Deep Fakes these days. The IEM is no exception, so I was tasked with setting up a CUDA workstation for our users.
Hardware
I didn't specify the hardware myself (that was done by a colleague who actually wants to use the system); I only quickly checked whether it would be supported on Linux. So yesterday a largish parcel landed on my table, containing a "MIFCOM High-End Workstation" equipped with:
- Intel Xeon W-2255
- 128GB RAM
- NVIDIA GeForce RTX 3080
- ...
Nice.
The case is huge, and not even 19"-compatible. (It reminds me of my first big tower in the mid 90s. Computers have shrunk since then...) Let's hope we find a place for the machine.
It also features a 2kW PSU to power an energy-hungry graphics card. And Machine Learning gets better the longer you keep it running...
So, from an environmental perspective...Not entirely nice. 😟
Base System: Debian
There's really not much to say here. A standard installation of Debian 11 "bullseye" with my trusted USB stick gets the system up and running.
Standard tweaks
All apt sources in /etc/apt/sources.list.d/
apt repositories can be listed either in the file /etc/apt/sources.list
or in files with a .list extension in the /etc/apt/sources.list.d/ directory.
I very much prefer a single scheme on my machines, so I move the standard repository configuration:
mv /etc/apt/sources.list /etc/apt/sources.list.d/debian.list
etckeeper
The most important part of a server setup is /etc.
All the data resides on some fileserver (and is being properly backed up),
and the software itself is installed via apt (so can easily be reinstalled).
But the sysadmin's work goes into /etc.
To have a backup and a rollback option, I track all the configuration with git,
using the etckeeper package.
This will automatically commit changes introduced by package changes (that is:
whenever something is installed/upgraded/uninstalled via apt), and will
also commit any pending changes each night.
However, I strive to keep autocommits to a minimum and commit whatever changes
I made to the configuration immediately. But humans tend to forget...
Anyhow, here's how I install it:
cd /etc
apt install etckeeper
git config --add user.name "IEM network operation center / IOhannes m zmoelnig"
git config --add user.email noc@iem.at
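The rollback option mentioned above is plain git. Here's a minimal sketch of the commit-and-revert cycle, demonstrated in a throwaway directory rather than the real /etc (file name and commit messages are made up):

```shell
# Demonstrate the commit/rollback cycle that tracking /etc with git enables,
# using a temporary directory instead of the real /etc:
tmpetc=$(mktemp -d)
cd "$tmpetc"
git init -q
git config user.name "demo"
git config user.email "demo@example.com"

echo "Port 22" > sshd_config
git add -A && git commit -qm "initial state"

echo "Port 2222" > sshd_config            # a configuration change...
git add -A && git commit -qm "change ssh port"

git checkout -q HEAD~1 -- sshd_config     # ...rolled back to the previous commit
cat sshd_config                           # prints: Port 22
```

On an etckeeper-managed machine the same git commands work directly inside /etc (etckeeper also offers `etckeeper vcs <git-subcommand>` as a wrapper).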
Standard packages
The default editor that comes on a minimal Debian installation is nano.
It's small and I hate it.
I do most of my coding work in emacs,
but that is huge so I usually don't use it for server administration.
Instead I always use vi when working via ssh. Weird.
For complex package selections I also like an interactive frontend to apt;
for that I love aptitude.
And for whatever reason, must-haves like curl are also missing from the default installation.
And then there are some live-monitoring tools like htop and iftop.
apt install vim aptitude curl htop iftop
update-alternatives --config editor
Monitoring the system
For some basic monitoring of the host, I use collectd, which can collect arbitrary (numerical) data (even from sources that show up later, or vanish again).
I know that modern people prefer things like Prometheus, but I'm a bit old-fashioned. Apart from that I think that Prometheus is both overkill and bloated (I keep struggling with Prometheus' huge WAL-files on another machine: 6GB for 2 weeks of data? c'mon).
So collectd it is.
Because Debian's collectd package comes with many plugins which I'll never need, I only install the
bare minimum by skipping all "Recommended" packages:
apt install --no-install-recommends collectd
What I really love about collectd is that it can forward the collected data
to a remote machine: my central monitoring host, which shows me nice graphs in a web interface.
We configure this by adding a file /etc/collectd/collectd.conf.d/network.conf
with the following content:
LoadPlugin network
<Plugin network>
  <Server "MONITOR" "25826">
    SecurityLevel Encrypt
    Username "USERNAME"
    Password "PASSWORD"
  </Server>
</Plugin>
USERNAME and PASSWORD of course need to match the values set up on the MONITOR machine.
Sometimes machines show up under multiple names, and the collecting monitor host
might pick up one that I don't like, so I force a name by setting it in
/etc/collectd/collectd.conf.d/hostname.conf:
Hostname "lucier"
Finally restart the collectd process:
systemctl restart collectd
Nvidia drivers
To unleash the full power of the GPU, we (unfortunately) must use the proprietary Nvidia drivers.
They are shipped in the non-free Debian repositories,
so we enable them via the file /etc/apt/sources.list.d/nonfree.list:
deb http://deb.debian.org/debian/ bullseye contrib non-free
deb-src http://deb.debian.org/debian/ bullseye contrib non-free

deb http://security.debian.org/debian-security bullseye-security contrib non-free
deb-src http://security.debian.org/debian-security bullseye-security contrib non-free

deb http://deb.debian.org/debian/ bullseye-updates contrib non-free
deb-src http://deb.debian.org/debian/ bullseye-updates contrib non-free
After running apt update (to register packages from the new sources),
I first install the nvidia-detect utility and run it, to see which driver package is required for the GeForce RTX 3080:
# apt install nvidia-detect
# nvidia-detect
Detected NVIDIA GPUs:
68:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2216] (rev a1)

Checking card:  NVIDIA Corporation Device 2216 (rev a1)
Your card is supported by the default drivers.
Your card is also supported by the Tesla 460 drivers series.
It is recommended to install the
    nvidia-driver
package.
#
Cool, so we just dive right into it and install the nvidia-smi package,
which pulls in about 1GB of packages.
After rebooting the machine (so it can load the nvidia module instead
of the open-source nouveau driver), we should be up and running.
We can use nvidia-smi to check if everything is in order:
$ nvidia-smi
Fri Feb 11 09:48:17 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3080    On   | 00000000:68:00.0  On |                  N/A |
| 30%   30C    P8     3W / 320W |      1MiB / 10015MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Monitoring the GPU
I also want to monitor the GPU utilization. There's no collectd plugin for that shipped with Debian, but I found collectd-cuda on GitHub. It had some problems on another setup of mine (with a somewhat faulty card), so now I'm using my own fork. The script collects GPU utilization, temperature, memory consumption and power draw.
Integrating the script is pretty easy.
Assuming we cloned it into /usr/local/lib/collectd-cuda, we add it to collectd
by creating a file /etc/collectd/collectd.conf.d/cuda.conf with the following
contents:
LoadPlugin exec
<Plugin exec>
  Exec cudamon "/usr/local/lib/collectd-cuda/collectd_cuda.sh"
</Plugin>
cudamon is a user that has access to the Nvidia device (being a member of the video group).
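For the curious: a collectd exec plugin is just a script that periodically prints PUTVAL lines on stdout. Here's a minimal sketch of that protocol (the identifier names are hypothetical; a real script would query the card, e.g. via nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits):

```shell
# Emit a single reading in collectd's exec-plugin plain-text protocol.
# collectd exports COLLECTD_HOSTNAME and COLLECTD_INTERVAL to exec scripts;
# we fall back to made-up defaults when run outside of collectd.
HOST=${COLLECTD_HOSTNAME:-lucier}
INTERVAL=${COLLECTD_INTERVAL:-10}
utilization=42   # placeholder value instead of an actual GPU query
echo "PUTVAL \"${HOST}/cuda-0/percent-utilization\" interval=${INTERVAL} N:${utilization}"
```

A real plugin script loops forever, sleeping ${INTERVAL} seconds between readings.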
Don't forget to restart the collecting daemon with systemctl restart collectd
Docker
Debian comes with Docker packages, but while I'm a die-hard Debian kid, I switched to using upstream packages a while ago.
To use them, first install the GPG-key that is used to sign the repository:
curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
Then add the repository to the apt-sources,
by creating a file /etc/apt/sources.list.d/docker.list
containing:
deb [signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian bullseye stable
After running the obligatory apt update
I can install Docker with:
apt install docker-ce
To let our users run Docker, make sure they are in the docker group:
adduser noc docker
(Log out and log in again (or perform some other runes) to make your working session aware of the new group membership.)
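A quick way to check whether the current shell already sees the new group (the supplementary group list is fixed at login time):

```shell
# Does the running session know about the docker group yet?
if id -nG | grep -qw docker; then
    echo "docker group active - docker commands will work"
else
    echo "not yet - log out/in or run 'newgrp docker'"
fi
```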
Finally you can test whether everything works by running the "Hello World" container:
docker run hello-world
Monitoring with Docker
Each Docker container creates ephemeral network interfaces and mounts,
which I don't really want to see in my collectd stats,
so I just exclude them from the collection
via this /etc/collectd/collectd.conf.d/docker-ignore.conf:
<Plugin df>
  FSType overlay
  IgnoreSelected true
</Plugin>

<Plugin interface>
  Interface "/^veth/"
  IgnoreSelected true
</Plugin>
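The "/^veth/" pattern is a regular expression: together with IgnoreSelected true it drops every interface whose name starts with veth (the per-container peer interfaces). A quick illustration of which names are affected (the interface names here are made up):

```shell
# Which interface names would be dropped by the /^veth/ + IgnoreSelected rule?
for ifname in eth0 veth1a2b3c docker0; do
    case "${ifname}" in
        veth*) echo "${ifname}: ignored"   ;;
        *)     echo "${ifname}: collected" ;;
    esac
done
```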
Restart the collecting daemon after making any changes.
All together now: Nvidia + Docker
In order to be able to use the CUDA hardware within a Docker container, we still need to glue them together, using libnvidia-container. First, install the repository's GPG key:
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey > /usr/share/keyrings/libnvidia-container-keyring.asc
Then add the repositories to the apt sources, e.g. in a file /etc/apt/sources.list.d/libnvidia-container.list:
deb [signed-by=/usr/share/keyrings/libnvidia-container-keyring.asc] https://nvidia.github.io/libnvidia-container/stable/debian10/$(ARCH) /
deb [signed-by=/usr/share/keyrings/libnvidia-container-keyring.asc] https://nvidia.github.io/nvidia-container-runtime/stable/debian10/$(ARCH) /
After the obligatory apt update, install the runtime:
apt install nvidia-container-runtime
There's a typo in the configuration file /etc/nvidia-container-runtime/config.toml shipped with that package:
it should read ldconfig = "/sbin/ldconfig" instead of ldconfig = "@/sbin/ldconfig",
so remove the superfluous @.
Now we can test whether the GPU is reachable from within a container:
docker run --rm --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi
Upgrading Nvidia drivers with backports
Unfortunately, trying to run an nvidia/cuda:11.4.2 container errors out for me:
# docker run --rm --gpus all nvidia/cuda:11.4.2-devel-ubuntu20.04 nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.4, please update your driver to a newer version, or use an earlier cuda container: unknown.
And indeed, as can be seen in the nvidia-smi output on the host system, my CUDA version is 11.2.
The Nvidia driver that is shipped in Debian/non-free "bullseye" is 460.91.03,
which, according to the
NVIDIA CUDA Toolkit Release Notes (Table 3),
supports CUDA 11.2.0.
So it seems we have to upgrade our Nvidia drivers first. Luckily, "bullseye-backports" contains 470.103.01, which should give us CUDA 11.4 Update 4.
The main backports repository is configured in /etc/apt/sources.list.d/backports.list:
deb http://deb.debian.org/debian/ bullseye-backports main
deb-src http://deb.debian.org/debian/ bullseye-backports main
However, what we really need is the non-free backports repository, as this is where the Nvidia drivers are kept.
I enable that in /etc/apt/sources.list.d/backports-nonfree.list:
deb http://deb.debian.org/debian/ bullseye-backports contrib non-free
deb-src http://deb.debian.org/debian/ bullseye-backports contrib non-free
Because there is a whole suite of interdependent Nvidia packages (that require exact versions), we need to upgrade all of them to the version from backports:
apt install -t bullseye-backports nvidia-smi
In order to test the new drivers, we need to unload the kernel-modules first. Manually unload and load the modules (or simply reboot the machine):
rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
modprobe nvidia_uvm nvidia_drm nvidia_modeset nvidia
And finally we can use the GPU in the Docker container:
$ docker run --rm --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi
Fri Feb 11 10:45:00 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:68:00.0 Off |                  N/A |
| 30%   59C    P0    84W / 320W |      0MiB / 10015MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
🎉
Finishing up
Adding Users
Our users are managed via LDAP, but the CUDA workstation is a bit of an island. Only very few people need access to it, but I want them to be able to authenticate with their one true password.
So I'm setting up the system to use local users, but configure it to attempt password verification via LDAP.
This boils down to an /etc/nsswitch.conf file:
passwd:         files systemd
group:          files systemd
shadow:         files ldap
gshadow:        files
It's also important that our users have the correct permissions to access the GPU hardware
and are able to start/stop docker containers.
For this we make sure that they are members of a few default groups, by adding the following to
/etc/adduser.conf:
EXTRA_GROUPS="video users docker"
ADD_EXTRA_GROUPS=1
This way I have proper control over who can access the machine. I just need to create a new local user with the same username as their LDAP-account, but with disabled local password:
adduser --disabled-password SOMEUSER
SOMEUSER will then be able to login with their LDAP-password.
Notifying users of reboots
Every now and again, I'm going to do system upgrades that require reboots.
Now the typical workflow of our researchers is to log into the machine, start a Docker container that performs some machine learning task, and then leave the machine. After a couple of days (or weeks) they come back to check the results.
However, when the machine gets rebooted, most Docker containers do not restart automatically.
To enable auto-restarting, containers have to be started with the --restart unless-stopped flag:
docker run --restart unless-stopped ...
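To check after the fact whether a running container would survive a reboot, docker inspect can print its restart policy (the container name mljob is hypothetical):

```shell
# Print the restart policy of an existing container:
#   docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' mljob
# Only "always" and "unless-stopped" policies survive a reboot:
policy="unless-stopped"    # substitute the output of the inspect call above
case "${policy}" in
    always|unless-stopped) echo "survives a reboot" ;;
    *)                     echo "will NOT be restarted" ;;
esac
```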
But people tend to forget these things, so I would like to notify them that the machine has rebooted and their jobs have stopped. After a reboot I don't really know who had a running job before the reboot, so I'll just send an email to all users - or at least those who are authenticated via our LDAP server:
#!/bin/sh

body() {
   cat <<EOF
Hallo,

Der CUDA-Rechner "$(hostname)" wurde soeben neu gestartet.
Solltest du einen Job laufen gehabt haben, wurde dieser
beendet, und du musst ihn evtl. neu starten.

The CUDA-host "$(hostname)" has just been rebooted.
Any running jobs of yours have been terminated, and you probably
have to restart them manually.


Cheers.
The automated reboot notifier,
on behalf of noc@iem.at
EOF
}


now=$(date +"%Y-%m-%d %H:%M")
domail() {
   body | mail --subject="$(hostname) reboot @${now}" --append="Reply-To: IEM Network Operation Center <noc@iem.at>" "$1"
}


for x in /home/*/; do
   x=${x%/}
   x=${x##*/}
   if getent -s ldap shadow "${x}" >/dev/null; then
      domail "${x}@iem.at"
   fi
done
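The two parameter expansions in the loop do the user-name extraction: ${x%/} strips the trailing slash and ${x##*/} strips the longest leading match of */, i.e. the /home/ prefix:

```shell
# Turn a /home/<user>/ path into the bare user name:
x="/home/alice/"
x=${x%/}        # -> /home/alice
x=${x##*/}      # -> alice
echo "$x"       # prints: alice
```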
This script is run by systemd at startup via the following
service definition in /etc/systemd/system/mail4reboot.service:
[Unit]
Description=Notify all LDAP users about a reboot
ConditionFileIsExecutable=/usr/local/bin/mail4reboot

After=syslog.target network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/mail4reboot

[Install]
WantedBy=multi-user.target
After enabling the service with systemctl enable mail4reboot,
people get pesky mails whenever the machine recovers from going down...
People will eventually finish their projects and leave the institute. We will think about stopping those annoying mails for them when this happens.
Setting a Message-of-the-Day
Our machines are named after deceased influencers in our field (computer music).
This machine was named after Alvin Lucier (best known for his mindblowing
"I Am Sitting in a Room").
To give our users a little bit of context (and startup help),
we add an /etc/motd file that is displayed whenever you login to the machine.

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

                         Alvin Lucier (1931-2021)

  Alvin Lucier was an American composer of experimental music and sound
  installations that explore acoustic phenomena and auditory perception.
  Much of his work is influenced by science and explores the physical properties
  of sound itself: resonance of spaces, phase interference between closely tuned
  pitches, and the transmission of sound through physical media.


  Welcome to lucier.iemnet, a Docker/CUDA workstation.

  To check whether the Docker/CUDA setup is working for you, try to run:


      docker run --rm --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Final words
That's it. Happy machine learning.