Custom Containers (Apptainer)

Building Custom Containers

While many packages can be installed in a Conda environment, not everything can easily installed that way. Sometimes you want a system level package instead, or you need a library or tool that isn’t available to Conda but is present in the system repositories (or externally). Or perhaps you have a pre-built binary that runs on a different Linux distribution than the one running on the cluster. This is where building your own container can be utilized.

You have two options for creating your container, extend one of the pre-built containers just customizing it as needed, or build your own from scratch. Extending requires the least effort, but will not fit all use cases. Creating a container from scratch offers the most flexibility, but can also take a long time to troubleshoot build and run-time issues – it’s not for the faint of heart!

Extending Existing Containers

It’s possible to build an Apptainer container from a node in the cluster, which comes in handy when you just want to extend the functionality of an existing container. A container consists of layers that can be stacked on top of each other, you can easily extend either the default or remote desktop container with your customizations. The downside to this layering approach is that the containers can get quite large, and take a while to build.

The following example extends the minimal container, which is available pre-built in the cluster. The .def file that this container was built with is presented below in the section Building Containers From Scratch, as an example of building your own container from scratch.

Basic Apptainer definition file that extends the minimal environment, cowsay.def
 1Bootstrap: localimage
 2From: /cluster/share/singularity/minimal.sif
 3
 4%post
 5  # Update the DNF repository cache so we can install packages
 6  dnf --refresh makecache && \
 7
 8  # Install the cowsay package
 9  dnf install -y cowsay && \
10
11  # Cleanup the cache to save space before we compress the image
12  dnf -y clean all && \
13  rm -rf /var/cache/dnf /var/cache/yum
14
15%labels
16  Author Zach McGrew <mcgrewz@wwu.edu>
17  Version v0.0.1
18
19%help
20  This container is designed to demonstrate extending the default
21  JupyterHub container.

Save the above script to a directory where you have plenty of space to work, as you will be generating a large file (This is example will use about 475MB) when the build completes. Building the image is done with the apptainer build command. For additional information about the build sub-command, please see the full Apptainer documentation.

Note

The example build session below forces the build process ( -F ). If you don’t force the build you will get warning messages about the help and label sections. By forcing the build, the labels ( apptainer inspect ) and the run-time help message ( apptainer run-help ) are correctly set.

Warning

The “Verifying bootstrap image” section of the build can take multiple minutes to pull the container image from the file server and verify that it’s correct. If you extend the remote-desktop.sif, or remote-desktop-gpu.sif it can take upwards of five (5) minutes for this stage of the build alone.

The final compression stage of “Creating SIF file…” can also take a very long time. It’s best to request as many CPUs as you can in your HTCondor job to help with this, so it can utilize as many cores as possible. Requesting at least 8 CPUs is a good starting point to build the example here in less than 10 minutes.

USER@c-X-X:~/apptainer_jupyterhub_extend$ apptainer build -F cowsay.sif cowsay.def
INFO:    User not listed in /etc/subuid, trying root-mapped namespace
INFO:    The %post section will be run under fakeroot
INFO:    Starting build...
INFO:    Verifying bootstrap image /cluster/share/singularity/minimal.sif
INFO:    Running post scriptlet
+ dnf --refresh makecache
Rocky Linux 8 - AppStream                       2.4 MB/s |  11 MB     00:04
Rocky Linux 8 - BaseOS                          983 kB/s | 7.1 MB     00:07
Rocky Linux 8 - Extras                           42 kB/s |  14 kB     00:00
Rocky Linux 8 - PowerTools                      1.5 MB/s | 2.7 MB     00:01
Extra Packages for Enterprise Linux 8 - x86_64  4.5 MB/s |  16 MB     00:03
Metadata cache created.
+ dnf install -y cowsay
Last metadata expiration check: 0:00:06 ago on Fri Mar  1 22:45:29 2024.
Dependencies resolved.
================================================================================
 Package          Architecture     Version                 Repository      Size
================================================================================
Installing:
 cowsay           noarch           3.7.0-10.el8            epel            47 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 47 k
Installed size: 72 k
Downloading Packages:
cowsay-3.7.0-10.el8.noarch.rpm                   57 kB/s |  47 kB     00:00
--------------------------------------------------------------------------------
Total                                            39 kB/s |  47 kB     00:01
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                        1/1
  Installing       : cowsay-3.7.0-10.el8.noarch                             1/1
  Running scriptlet: cowsay-3.7.0-10.el8.noarch                             1/1
  Verifying        : cowsay-3.7.0-10.el8.noarch                             1/1

Installed:
  cowsay-3.7.0-10.el8.noarch

Complete!
+ dnf -y clean all
46 files removed
+ rm -rf /var/cache/dnf /var/cache/yum
INFO:    Adding help info
INFO:    Adding labels
INFO:    Creating SIF file...
INFO:    Build complete: cowsay.sif

As shown above, there is a newly created cowsay.sif file in the directory you’re currently in. Make note of this directory, it will be specifying it as the custom location when starting the JupyterLab server via JupyterHub. An easy way to get this path is to use the pwd command to print the working directory. See the section Using Custom Containers for running below.

Cowsay running in the new custom container

Building Containers From Scratch

Building a container from scratch requires more effort, but allows for maximum customization. It can also be difficult to troubleshoot. Be prepared to spend lots of time reading through the Jupyter error messages logged in your ~/.jupyterhub.condor.err and ~/.jupyterhub.condor.out after attempting to start the Jupyter server.

There are a handful of packages, and a few directories that must be created in order to run a custom container. These are detailed in the note below.

Note

These packages are required for JupyterLab to start via HTCondor:

  • jupyterhub >= 4.1.4

  • notebook >= 4

  • jupyterlab >= 4

  • nodejs

  • batchspawner >= 1.3

To get access to the normal cluster file server and local scratch space you should pre-create the following directories to ensure they can be properly mounted inside the container when it starts:

  • /cluster

  • /scratch

  • /scratch_memory_backed

What follows is the full source for minimal.def, which was used to build minimal.sif that was extended in the example above. This is also the base container for the Default environment, which is then extended again via the Remote Desktop environments. If you would like to see the source for those files, please contact us via the Support page.

The minimal.def file below has comments on each section of code that it’s executing. Strictly speaking, not every one of the packages that are installed are required, but they make the container much more functional, without making the size grow too much.

Minimal Apptainer definition file (used for extending above), minimal.def
  1Bootstrap: docker
  2From: rockylinux:8.9.20231119
  3
  4%files
  5  conda2kern /usr/bin/
  6
  7%post
  8  # Variabless to make updates easier
  9  CONDA_VERSION="py311_24.1.2-0"
 10  PANDOC_VERSION="3.1.11.1"
 11
 12  # Ensure UTF-8 for build time
 13  export LC_ALL="C.UTF-8"
 14  export LANG="C.UTF-8"
 15
 16  # Base update, various commmon repos
 17  dnf -y --refresh upgrade && \
 18  dnf -y install epel-release && \
 19  dnf --enablerepo=epel group && \
 20  dnf config-manager --set-enabled powertools && \
 21  dnf config-manager --set-enabled appstream && \
 22  dnf -y --refresh upgrade && \
 23
 24  # Basic packages that are nice to have
 25  dnf -y install \
 26      tar \
 27      xz \
 28      gzip \
 29      bzip2 \
 30      git \
 31      zip \
 32      unzip \
 33      wget \
 34      findutils \
 35      which \
 36      bash-completion \
 37      less \
 38      htop \
 39      man-db \
 40      man-pages \
 41      tmux && \
 42
 43  # Setup directories for our cluster environment
 44  mkdir /cluster /scratch /scratch_memory_backed && \
 45
 46  # pandoc (From GitHub -- Not currently available in repos)
 47  curl -qsLLo /tmp/pandoc.tar.gz https://github.com/jgm/pandoc/releases/download/${PANDOC_VERSION}/pandoc-${PANDOC_VERSION}-linux-amd64.tar.gz && \
 48  tar -C /usr/ -xzf /tmp/pandoc.tar.gz pandoc-${PANDOC_VERSION}/bin && \
 49  rm /tmp/pandoc.tar.gz && \
 50
 51  # Miniconda
 52  curl -qsLLo /tmp/miniconda.sh "https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh" && \
 53  bash /tmp/miniconda.sh -b -p /opt/miniconda && \
 54  rm /tmp/miniconda.sh && \
 55  export PATH="/opt/miniconda/bin:${PATH}" && \
 56  # conda update -n base -c defaults conda && \
 57  # conda update --all && \
 58
 59  # Conda channels
 60  conda config --add channels conda-forge && \
 61  conda config --remove channels defaults && \
 62  conda config --set channel_priority strict && \
 63
 64  # Update Conda
 65  conda update -n base -c conda-forge conda
 66
 67  # Prefer the libmamba solver (try to take everything from conda-forge!)
 68  conda install -y -c conda-forge conda-libmamba-solver && \
 69  conda config --set solver libmamba && \
 70
 71  # Update Conda first
 72  conda update --all && \
 73
 74  # Conda Jupyter packages
 75  conda install -y -c conda-forge jupyterhub && \
 76  conda install -y -c conda-forge notebook && \
 77  conda install -y -c conda-forge jupyterlab && \
 78  conda install -y -c conda-forge jupyter-resource-usage && \
 79  conda install -y -c conda-forge nbgitpuller && \
 80  conda install -y -c conda-forge ipywidgets && \
 81  conda install -y -c conda-forge batchspawner && \
 82  conda install -y -c conda-forge nodejs && \
 83  conda install -y -c conda-forge python-lsp-server && \
 84
 85  # Pip packages
 86
 87  # Final cleanup
 88  python3 -m pip cache purge && \
 89  conda clean -ity && \
 90  dnf -y clean all && \
 91  rm -rf /var/cache/dnf /var/cache/yum && \
 92
 93  # Fix $PATH -- $SINGULARITYENV_PATH overrwrites ALL of our settings without this
 94  echo 'export PATH="/opt/miniconda/bin:${PATH}:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin"' >> /.singularity.d/env/999-zzz.sh && \
 95  chmod 755 /.singularity.d/env/999-zzz.sh && \
 96
 97  # Fix $PS1 prompt
 98  echo 'export PS1="<${APPTAINER_NAME}>[\u@\h \W]\$ "' >> /.singularity.d/env/999-zzz.sh && \
 99  chmod 755 /.singularity.d/env/999-zzz.sh
100
101%environment
102  export PATH="/opt/miniconda/bin:${PATH}:/usr/local/bin"
103  export LC_ALL="C.UTF-8"
104  export LANG="C.UTF-8"
105  export SING_USER_DEFINED_PREPEND_PATH="/opt/miniconda/bin"
106
107%labels
108  Author Zach McGrew <mcgrewz@wwu.edu>
109  Version v0.0.1
110
111%help
112  This container is designed to support the JupyterLab environment.

In the above minimal.def there is a local file included, conda2kern. This script is explained in the Custom Environments page, but is also included here for those curious as to how the script works. Feel free to include this in your container to continue making new environments easier.

The conda2kern file referenced in minimal.def
  1#!/bin/sh
  2
  3while getopts 'd:hn:pr' opt ; do
  4    case $opt in
  5      h|\?)
  6        echo "Usage: $0 [-d <display name>] [-n <kernel name>] [-h]"
  7        echo "-h:                Display this help message"
  8        echo "-d <display name>: The display name"
  9        echo "-n <kernel name>:  The kernel name"
 10        echo "-p:                Force Python environment"
 11        echo "-r:                Force R environment"
 12        exit 0
 13        ;;
 14      d)
 15        dname="$OPTARG"
 16        ;;
 17      n)
 18        name="$OPTARG"
 19        ;;
 20      p)
 21        force_env="python"
 22        ;;
 23      r)
 24        force_env="r"
 25        ;;
 26    esac
 27done
 28
 29if [ -z "$name" ] ; then
 30  name="$CONDA_DEFAULT_ENV"
 31fi
 32
 33if [ -z "$dname" ] ; then
 34  dname="$CONDA_DEFAULT_ENV"
 35fi
 36
 37KERNEL=~/.local/share/jupyter/kernels/"${name}"/kernel.json
 38
 39if [ ! -e "${KERNEL}" ] ; then
 40  case $force_env in
 41    python)
 42      # Check if ipykernel is installed yet, of we need to install it
 43      if ! conda list ipykernel | awk '$1 ~ /ipykernel/ {found=1} END {exit !found;}' ; then
 44        conda install -c conda-forge -y ipykernel
 45      fi
 46      ;;
 47    r)
 48      # Check if r-irkernel is installed yet, of we need to install it
 49      if ! conda list r-irkernel | awk '$1 ~ /r-irkernel/ {found=1} END {exit !found;}' ; then
 50        conda install -c conda-forge -y r-irkernel r-essentials
 51      fi
 52      ;;
 53    *)
 54      # Environment not forced, try and detect it
 55      if conda list r-essentials | awk '$1 ~ /r-base/ {found=1} END {exit !found;}' ; then
 56        force_env="r"
 57      elif conda list ipykernel | awk '$1 ~ /ipykernel/ {found=1} END {exit !found;}' ; then
 58        force_env="python"
 59      else
 60        echo "r-essentials package not installed, and ipykernel package is not installed."
 61        echo "Unable to determine which kernel type to install."
 62        echo "Please re-run this script with either -r or -p to specify which one you want."
 63        exit 1
 64      fi
 65      ;;
 66  esac
 67
 68  # Actually install the kernel now
 69  case $force_env in
 70    python)
 71      python -m ipykernel install --user --name "$name" --display-name "$dname" || exit 1
 72      ;;
 73    r)
 74      Rscript -e "IRkernel::installspec(name = '$name', displayname = '$dname')" || exit 1
 75      ;;
 76    *)
 77      echo "There was an unhandled error trying to install the kernel."
 78      echo "You should not see this message on your terminal. Ever."
 79      exit 1
 80      ;;
 81  esac
 82fi
 83
 84# Should the environment be fixed? Or will it be added?
 85! grep -q '"env":' "${KERNEL}"
 86fixEnv="$?"
 87
 88# Insert or modify the env section to include the needed CONDA_* and PATH variables
 89awk -v fixEnv="$fixEnv" -v CONDA_PREFIX="$CONDA_PREFIX" -v CONDA_DEFAULT_ENV="$CONDA_DEFAULT_ENV" -v PATH="$PATH" '
 90
 91function conda_prefix() { print "  \"CONDA_PREFIX\": \"" CONDA_PREFIX "\","; }
 92
 93function conda_default_env() { print "  \"CONDA_DEFAULT_ENV\": \"" CONDA_DEFAULT_ENV "\","; }
 94
 95function path() {
 96  #PATH is special. Prepend our new bin directory, cleanup anything else...
 97  split(PATH, spath, /:/);
 98  npath=CONDA_PREFIX "/bin";
 99  seenp[npath] = 1;
100  for (p in spath) {
101    if (! (spath[p] in seenp)) {
102      seenp[spath[p]] = 1;
103      npath = npath ":" spath[p];
104    }
105  }
106  print "  \"PATH\": \"" npath "\","
107}
108
109# If we find an existing ENV section, only replace some variables
110/"env": \{/ { inEnv = 1; print "Fixing existing env section..." | "cat 1>&2"; }
111
112# If in the env section, only replace these lines
113inEnv && /CONDA_PREFIX/ { conda_prefix(); next; }
114inEnv && /CONDA_DEFAULT_ENV/ { conda_default_env(); next; }
115inEnv && /PATH/ { path(); next; }
116inEnv && /},/ { inEnv = 0; }
117
118# Normally we print every line we see
119{print $0;}
120
121# Not in the env section, but found a closing array section
122# Build the env section here
123!fixEnv && /\],/ {
124  print "Inserting custom env section..." | "cat 1>&2";
125  print " \"env\": {";
126  conda_prefix();
127  conda_default_env();
128  path();
129  print " },"
130  #No longer in the environment section
131  inEnv = 0;
132}
133' "$KERNEL" > "${KERNEL}.swp"
134
135# https://unix.stackexchange.com/questions/485004/remove-trailing-commas-from-invalid-json-to-make-it-valid
136# Modified to preserve whitespace before '}'
137sed -i.bak ':begin;$!N;s/,\n\(\s*\)}/\n\1}/g;tbegin;P;D' "${KERNEL}.swp"
138
139mv "${KERNEL}" "${KERNEL}.BAK"
140mv "${KERNEL}.swp" "${KERNEL}"

Using Custom Containers

In order to use your custom environment, you need to do set two options:

  1. Select “custom” for the environment drop down choice

    Custom environment selected in the environment dropdown
  2. Specify the path to the container to run

    +SingularityImage specified in the Extra HTCondor options field

On the Server Options page the section Extra HTCondor options at the bottom allows you to specify additional options to be passed to the HTCondor scheduler. We can use this field to specify that we want to run a custom image instead of the default.

+SingularityImage="/cluster/research-groups/researcher_lastname/containers/my_cool_container.sif"

This allows you to specify which image to start on the execute point in the cluster. Make sure it has the required packages mentioned above in the Building Containers From Scratch section. If you extended the minimal or one of the other pre-built containers the required packages will already be installed.