Building Containers with HPC Container Maker

HPCCM – HPC Container Maker

Nvidia is developing an open source tool that can create a container spec file for Docker or Singularity. HPC Container Maker (HPCCM), pronounced “h-p-see-um,” aims to ease the burden of building containerized applications that use combinations of compilers, libraries, and other tools. It starts with a base OS container and lets you add other tools and packages or build components from source by writing a simple Python script, which is then processed into a spec file for Docker or Singularity. HPCCM has some notable features:

  • collects and codifies best practices
  • makes recipe file creation easy, repeatable, and modular
  • becomes a reference and a vehicle to drive collaboration
  • is container implementation-neutral

Rather than create yet another spec language, HPCCM relies on Python code for the “recipe” of the container you want to build, regardless of the target container type. From the HPCCM documentation, “A recipe consists of one or more stages. A basic recipe will contain a single stage. Stages are the same concept as Docker multistage builds.” The recipe contains the steps you want to take in your container build, all written in Python, so within a recipe you can create variables and use if/elif/else statements, loops, functions, or almost anything else the language offers.
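For example, the following short sketch (my own, not from the HPCCM documentation; the use_fortran flag is just an illustrative variable) shows ordinary Python driving a recipe:

use_fortran = True

# Choose a base image
Stage0.baseimage('ubuntu:16.04')

# Build the package list with plain Python before handing it to a building block
compilers = ['gcc', 'g++']
if use_fortran:
    compilers.append('gfortran')

Stage0 += apt_get(ospackages=compilers)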

HPCCM includes parameterized building blocks that provide specialization within a recipe. The currently available building blocks are:

  • apt-get
  • cmake
  • fftw
  • gnu
  • hdf5
  • mkl
  • mlnx_ofed
  • mvapich2
  • mvapich2_gdr
  • ofed
  • openmpi
  • packages
  • pgi
  • python
  • yum

Building blocks add higher level functionality to a recipe. The following is a quick example:

apt_get(ospackages=['gcc', 'g++', 'gfortran'])

This building block allows packages to be added to the build with the use of apt-get. Be careful to match the package manager to the operating system of the base image (apt-get for Ubuntu, yum for CentOS). You can find a deeper explanation of the building blocks in the HPCCM documentation.
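If you would rather not tie the recipe to one distribution, the packages building block from the list above is intended to select the appropriate package manager for the base image. As a rough sketch:

# Should resolve to apt-get on Debian/Ubuntu base images and yum on CentOS
packages(ospackages=['gcc', 'g++', 'gfortran'])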

If building blocks can’t help you build everything you want in your container, you can use templates, which are abstractions for common operations such as downloading files, configuring and building source packages (e.g., using autotools), and working with archives. Typically, templates are used by building blocks rather than directly in recipes, but they can be used as needed. The current list of templates includes:

  • ConfigureMake
  • git
  • sed
  • tar
  • toolchain
  • wget

Templates are less of an abstraction than building blocks.

An even lower level of abstraction than templates is “primitives,” which are low-level implementations of specific container instructions that let you fine-tune your container. The current list of primitives includes:

  • baseimage
  • blob
  • comment
  • copy
  • environment
  • raw
  • shell
  • workdir

Not every primitive maps to every container type, which means that, currently, some primitives are tied to a specific container type. You can read about primitives on the HPC Container Maker GitHub site.
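To give a flavor of how primitives look inside a recipe, here is a small hypothetical snippet (the file name hello.c and the paths are placeholders I made up):

# Build a small helper directly with primitives (hypothetical file and paths)
Stage0 += comment('Example of low-level primitives')
Stage0 += copy(src='hello.c', dest='/var/tmp/hello.c')
Stage0 += shell(commands=['gcc -o /usr/local/bin/hello /var/tmp/hello.c'])
Stage0 += environment(variables={'PATH': '/usr/local/bin:$PATH'})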

Installing HPCCM is fairly simple. The command

# pip install hpccm

uses the pip Python package management system to install HPCCM. For other Python package management tools, you will have to build and install HPCCM yourself (it’s not difficult).
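One way to do a source install, assuming the usual setuptools layout of the project’s GitHub repository, is roughly:

$ git clone https://github.com/NVIDIA/hpc-container-maker.git
$ cd hpc-container-maker
$ python setup.py install --user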

Examples

To better understand HPCCM, I’ll look at a couple of examples. Because I’m using my CentOS 7.5 laptop, these examples won’t involve any GPUs.

Example 1

The first example is very simple: just a base OS along with the GCC compilers (GCC, G++, and GFortran). The HPCCM recipe is basically trivial for this example:

$ more basic.py
"""This example demonstrates recipe basics.
 
Usage:
# hpccm.py --recipe recipes/examples/basic.py --format docker
# hpccm.py --recipe recipes/examples/basic.py --format singularity
"""
 
# Choose a base image
Stage0.baseimage('ubuntu:16.04')
 
# Install GNU compilers (upstream)
Stage0 += apt_get(ospackages=['gcc', 'g++', 'gfortran'])

The recipe uses only a single stage, because it simply builds an Ubuntu 16.04 container with the GCC compilers. The first executable command,

Stage0.baseimage('ubuntu:16.04')

starts with a base image as part of stage 0. In this case, it is Ubuntu 16.04. (It is interesting to test an Ubuntu container on a CentOS host.) The next executable command uses += to add to the stage 0 build:

Stage0 += apt_get(ospackages=['gcc', 'g++', 'gfortran'])

Specifically, it adds the GCC compilers that come with Ubuntu 16.04. You can add other packages or use other building blocks in this stage, or you could create a new stage (e.g., Stage1) with additional directives for your container. (Note that Singularity doesn’t currently understand stages, only Docker.) A common use for multiple stages is to build an application from source in the first stage and then copy the resulting binaries and run-time libraries into the second stage to reduce the overall container image size, making it easier to redistribute.
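A sketch of that pattern (my own; /usr/local/myapp is just a placeholder path) might look something like the following, keeping in mind that only the Docker output honors the second stage:

# Stage 0: full build environment
Stage0 += baseimage(image='ubuntu:16.04', _as='build')
Stage0 += apt_get(ospackages=['gcc', 'g++', 'make', 'wget', 'tar'])
# ... configure and build the application into /usr/local/myapp here ...

# Stage 1: smaller runtime image that copies in only the finished files
Stage1 += baseimage(image='ubuntu:16.04')
Stage1 += copy(_from='build', src='/usr/local/myapp', dest='/usr/local/myapp')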

Notice that the recipe says nothing specific about the container target. It could be used to target Docker or Singularity, depending on the target you select when you create the container spec file.

After the recipe is created, the next step is to generate the container spec file. For this example, the target is a Singularity container:

$ hpccm --recipe recipes/examples/basic.py --format singularity > Singularity

In this case, the output of HPCCM, the container spec file, is sent to the file Singularity. If you don’t specify an output file, HPCCM just sends the output to stdout. The resulting Singularity spec file is:

BootStrap: docker
From: ubuntu:16.04
 
%post
    apt-get update -y
    apt-get install -y --no-install-recommends \
        gcc \
        g++ \
        gfortran
    rm -rf /var/lib/apt/lists/*

With Docker as the HPCCM target, you use the following HPCCM command:

hpccm --recipe recipes/examples/basic.py --format docker > Docker

The resulting spec file is:

FROM ubuntu:16.04 AS stage0
 
RUN apt-get update -y && \
    apt-get install -y --no-install-recommends \
        gcc \
        g++ \
        gfortran && \
    rm -rf /var/lib/apt/lists/*

When running HPCCM, be sure the hpccm command is in $PATH, or you will have to give the full path to the command. You can edit the container spec file produced by HPCCM if you want, but remember that any changes you make will not be reflected in the HPCCM recipe.

After creating the container spec file, you simply need to build the container. For Singularity 2.5.1, the version of Singularity tested, the command

# singularity build basic.simg Singularity

will create a container named basic.simg.
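If you targeted Docker instead, the analogous step (assuming Docker is installed and the spec file is named Docker, as above) would look something like:

# docker build -t basic -f Docker .

The -f option is needed only because the spec file is not named Dockerfile.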

To be sure everything worked correctly, I’ll open a shell in the container and see if it is actually Ubuntu 16.04 (remember, the host is a CentOS 7.5 system):

# singularity shell basic.simg
Singularity: Invoking an interactive shell within container...
 
Singularity basic.simg:~> cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.4 LTS"
Singularity basic.simg:~> which gcc
/usr/bin/gcc
Singularity basic.simg:~> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v 
    --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.9' 
    --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs 
    --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ 
    --prefix=/usr --program-suffix=-5 --enable-shared 
    --enable-linker-build-id --libexecdir=/usr/lib 
    --without-included-gettext --enable-threads=posix 
    --libdir=/usr/lib --enable-nls --with-sysroot=/ 
    --enable-clocale=gnu --enable-libstdcxx-debug 
    --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new 
    --enable-gnu-unique-object --disable-vtable-verify 
    --enable-libmpx --enable-plugin --with-system-zlib 
    --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo 
    --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home 
    --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 
    --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 
    --with-arch-directory=amd64 
    --with-ecj-jar=/usr/share/java/eclipse-ecj.jar 
    --enable-objc-gc --enable-multiarch --disable-werror 
    --with-arch-32=i686 --with-abi=m64 
    --with-multilib-list=m32,m64,mx32 --enable-multilib 
    --with-tune=generic --enable-checking=release 
    --build=x86_64-linux-gnu --host=x86_64-linux-gnu 
    --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)

It looks successful. Now I can build on this success by creating a more involved container.

Example 2

The next example builds on the previous one but adds Open MPI to the build. This example is important because it makes you think about how the container is built and how that affects the HPCCM recipe:

"""This example demonstrates recipe basics.
 
Usage:
$ hpccm.py --recipe test2.py --format docker
# hpccm.py --recipe test2.py --format singularity
"""
 
# Choose a base image
Stage0.baseimage('ubuntu:16.04')
 
ospackages = ['make', 'wget', 'bzip2', 'tar']
Stage0 += apt_get(ospackages=ospackages)
 
# Install GNU compilers (upstream)
Stage0 += apt_get(ospackages=['gcc', 'g++', 'gfortran'])
 
Stage0 += openmpi(cuda=False, infiniband=False,
                  prefix='/usr/local/openmpi', version='3.1.0')

Notice that the Open MPI building block is used in this recipe; for this example, it is built without CUDA or InfiniBand support. To build it, I’ve chosen the GCC compilers, installed here with apt_get. (The higher level gnu building block could have been used instead; see the sketch below.)

However, the base OS image might not contain the tools needed to download and build Open MPI. These tools are added with the ospackages line, which defines the list of packages, and the apt_get building block on the line that follows, which installs them in the container.
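As a sketch of the gnu alternative mentioned above, and assuming the building block’s defaults install the GNU C, C++, and Fortran compilers, the compiler line would simply become:

Stage0 += gnu()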

To process the recipe and create the Singularity spec file, the following command was used:

hpccm --recipe ./test2.py --format singularity > Singularity2

For the curious, the resulting Singularity spec file follows:

BootStrap: docker
From: ubuntu:16.04
 
%post
    apt-get update -y
    apt-get install -y --no-install-recommends \
        make \
        wget \
        bzip2 \
        tar
    rm -rf /var/lib/apt/lists/*
 
%post
    apt-get update -y
    apt-get install -y --no-install-recommends \
        gcc \
        g++ \
        gfortran
    rm -rf /var/lib/apt/lists/*
 
# OpenMPI version 3.1.0
%post
    apt-get update -y
    apt-get install -y --no-install-recommends \
        file \
        hwloc \
        openssh-client \
        wget
    rm -rf /var/lib/apt/lists/*
%post
    mkdir -p /tmp && wget -q --no-check-certificate \
        -P /tmp https://www.open-mpi.org/software/ompi/v3.1/downloads/openmpi-3.1.0.tar.bz2
    tar -x -f /tmp/openmpi-3.1.0.tar.bz2 -C /tmp -j
    cd /tmp/openmpi-3.1.0 && ./configure \
        --prefix=/usr/local/openmpi --disable-getpwuid \
        --enable-orterun-prefix-by-default \
        --without-cuda --without-verbs
    make -j4
    make -j4 install
    rm -rf /tmp/openmpi-3.1.0.tar.bz2 /tmp/openmpi-3.1.0
%environment
    export LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
    export PATH=/usr/local/openmpi/bin:$PATH
%post
    export LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
    export PATH=/usr/local/openmpi/bin:$PATH

The Singularity command to process the spec file to create the container is:

# singularity build test2.simg Singularity2

Once the container is built, you can get a shell into it using the command:

# singularity shell test2.simg

Once in the shell, you can check that Open MPI is installed and functioning:

Singularity test2.simg:~> ls -s /usr/local/
total 5
1 bin  1 etc  1 games  1 include  1 lib  1 man  1 openmpi  1 sbin  1 share  1 src
Singularity test2.simg:~> /usr/local/openmpi/bin/mpif90 -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gfortran
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v 
    --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.9' 
    --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs 
    --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ 
    --prefix=/usr --program-suffix=-5 --enable-shared 
    --enable-linker-build-id --libexecdir=/usr/lib 
    --without-included-gettext --enable-threads=posix 
    --libdir=/usr/lib --enable-nls --with-sysroot=/ 
    --enable-clocale=gnu --enable-libstdcxx-debug 
    --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new 
    --enable-gnu-unique-object --disable-vtable-verify 
    --enable-libmpx --enable-plugin --with-system-zlib 
    --disable-browser-plugin --enable-java-awt=gtk 
    --enable-gtk-cairo 
    --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre 
    --enable-java-home 
    --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 
    --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 
    --with-arch-directory=amd64 
    --with-ecj-jar=/usr/share/java/eclipse-ecj.jar 
    --enable-objc-gc --enable-multiarch --disable-werror 
    --with-arch-32=i686 --with-abi=m64 
    --with-multilib-list=m32,m64,mx32 --enable-multilib 
    --with-tune=generic --enable-checking=release 
    --build=x86_64-linux-gnu --host=x86_64-linux-gnu 
    --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)
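For a quick smoke test beyond inspecting the compiler wrapper, you can launch a trivial command through mpirun. This is only a sketch: Open MPI 3.x refuses to run as root unless you add --allow-run-as-root, which applies here because the shell was started as root:

Singularity test2.simg:~> /usr/local/openmpi/bin/mpirun -np 2 --allow-run-as-root hostname

If everything is wired up correctly, the hostname should be printed twice.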

As an exercise, try modifying the recipe to build a CentOS 7 container instead of Ubuntu 16.04. (Hint: Besides changing the base image in the recipe file, remember that apt_get only applies to Debian-based images; for CentOS, use the yum building block or the distribution-neutral packages building block.)
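A minimal sketch of that variant, using the yum building block and the CentOS package names for the GNU compilers, might look like:

# Choose a CentOS base image
Stage0.baseimage('centos:7')

# Install GNU compilers from the CentOS repositories
Stage0 += yum(ospackages=['gcc', 'gcc-c++', 'gcc-gfortran'])

Alternatively, the distribution-neutral packages building block mentioned earlier lets the same recipe serve both base images.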

More examples are on the HPCCM GitHub page and in an Nvidia blog post. These examples include recipes for using CUDA and GPU applications.

A talk from the Nvidia GPU Technology Conference (GTC) discusses HPCCM. You can check out a PDF of the slides or a recording of the talk.