New Release of Lmod Environment Modules System

Lmod is an indispensable tool for high-performance computing. With the new release of version 6, now is a good time to review Lmod and look at its new capabilities.

One of the key tools for any cluster is environment modules, which allow you to define your user environment and the set of tools you need or want to build your application. The same modules carry over to the resource manager (job scheduler): A job script can load them to re-create, at run time, the environment you used to build the application.
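As a rough illustration of re-creating the build environment inside a job, the body of a batch script might look something like the sketch below. The module names come from the examples later in this article; the variable names and the commented launch line are placeholders of my own, and the scheduler directives are left out because they differ from one resource manager to the next.

```shell
#!/bin/sh
# Sketch of a batch job script body. Scheduler directives are omitted
# because they differ between resource managers.

# Assumed module names; use whatever was loaded when the application was built.
BUILD_MODULES="gnu/4.8 mpich/3.1"

if type module >/dev/null 2>&1; then
    module purge                 # start from a clean environment
    module load $BUILD_MODULES   # unquoted on purpose: one argument per module
    module list
    JOB_ENV_STATUS="requested modules: $BUILD_MODULES"
else
    JOB_ENV_STATUS="module command not found; environment unchanged"
fi
echo "$JOB_ENV_STATUS"

# The application launch would follow, for example (hypothetical binary name):
#   mpirun -np 4 ./my_app
```

The check for the module command keeps the script from failing outright on a node where Lmod has not been initialized.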

One environment module tool, Lmod, is under constant development and has some unique features. I've written about Lmod before, but version 6.0 was announced recently, and its new tools make it worth another look.

Fundamentals of Environment Modules

Programmers use a number of compilers, libraries, MPI libraries/tools, and other tools to write applications. For example, someone might code in Fortran with OpenACC, targeting GPUs, whereas another person might use PETSc to solve their problem. Tools that allow users and developers to specify the set of tools they want or need are key to operating an effective HPC system. “Effective” can mean better performance (choosing the tools that allow your code to run as fast as possible), more flexibility (user choice of tools that match their specific case), or ease of configuration of the environment for specific tools.

For example, assume you have three versions of the GNU compiler – 4.8, 4.9, and 5.1 – and the latest Intel and PGI compilers, along with the latest MPICH (3.1.4) and OpenMPI (1.8.5). Altogether you have 10 possible combinations (five compilers, two MPI libraries). At this point you have two choices: You can reduce the number of combinations (e.g., get rid of some of the compilers and perhaps one of the MPI libraries), or you can use environment modules so users can choose the combination of compiler and MPI library they want or need.
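To make the arithmetic concrete, a short loop can enumerate the pairings. The version strings for GNU, MPICH, and OpenMPI come from the paragraph above; the bare intel and pgi labels are placeholders, because no versions are given for them.

```shell
#!/bin/sh
# Enumerate every compiler/MPI pairing from the example in the text.
compilers="gnu/4.8 gnu/4.9 gnu/5.1 intel pgi"
mpis="mpich/3.1.4 openmpi/1.8.5"

count=0
for c in $compilers; do
    for m in $mpis; do
        count=$((count + 1))
        echo "$count: $c + $m"
    done
done
echo "total combinations: $count"
```

The last line prints a total of 10, and every additional compiler or MPI library multiplies the number of builds someone has to keep straight.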

As a user, I might build a parallel application using GNU 4.8 and MPICH; however, the GNU 5.1 compilers have some unique features, so I might want to try building the same application with them. Environment modules allow users to select the tools used for production, while also allowing them to use a different tool set for development.

The secret to environment modules is manipulating the environment variables. Modules change variables such as $PATH, $LD_LIBRARY_PATH, and $MANPATH on the user's behalf, according to the tool combinations desired. Changing the tool set changes these environment variables accordingly. It's fairly simple conceptually, but it's not always easy in practice.
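The idea can be sketched in a few lines of plain shell. This is not how any module tool is implemented; it only shows the kind of change that loading a tool makes. The install prefix and the demo_ variable names are made up for the illustration, and the script works on copies so the real environment is left alone.

```shell
#!/bin/sh
# Hypothetical install prefix for one tool; the path is an assumption.
prefix="/opt/example-tool/1.0"

# Copies of the variables, so the real environment is left alone.
demo_path="/usr/bin:/bin"
demo_ld=""
demo_man="/usr/share/man"

# prepend DIR LIST: echo LIST with DIR in front, handling an empty LIST.
prepend() {
    if [ -n "$2" ]; then echo "$1:$2"; else echo "$1"; fi
}

demo_path=$(prepend "$prefix/bin" "$demo_path")
demo_ld=$(prepend "$prefix/lib" "$demo_ld")
demo_man=$(prepend "$prefix/share/man" "$demo_man")

echo "PATH            -> $demo_path"
echo "LD_LIBRARY_PATH -> $demo_ld"
echo "MANPATH         -> $demo_man"
```

Doing this by hand for every tool and every combination is where the "not always easy in practice" part comes in.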

Lmod

Lmod is an environment module tool that provides simple commands for manipulating your tool selection. For example, you can list available modules, load and unload modules, purge all modules, swap modules, list loaded modules, query modules, ask for help on modules, show modules, and perform many other tasks related to modules. Other options aren't used as frequently but are there if you need them.

One of the coolest features of Lmod is its ability to handle a module hierarchy, so that Lmod will only display modules that are dependent on loaded modules, preventing you from loading incompatible modules. This feature can help reduce unusual errors with mismatched modules that are sometimes very difficult to diagnose. I'll explain more about module hierarchy in a later section, because it is a very important feature in Lmod.

One of the first widely used environment module tools is Environment Modules TCL/C, so-called because the code is written primarily in C and the modules in TCL. Lmod retains the ability to read and use modules written in TCL, but it adds the ability to read and use modules written in Lua, a popular language in its own right and a very embeddable language for applications.

Module files written in either TCL or Lua tell Lmod how to change environment variables for a particular tool. You place these files in a directory hierarchy and add a couple of commands in the module so that Lmod knows the tool dependencies.

Next, I discuss Lmod Hierarchical Modules and explain how to organize module files and how to limit the visibility of dependent module files. I'll use an example from my own cluster to help illustrate this process. Additionally, I've tried to add comments about Lmod best practices, some of which I’ve gathered from email discussions with Dr. McLay on the Lmod-users mailing list and others from Lmod documentation and presentations. I hope these help with your Lmod deployment.

Lmod Hierarchical Modules

One of the key capabilities of Lmod is module hierarchy. With TCL/C modules, you can load pretty much any modules you want, even if they are not compatible. Lmod, in contrast, shows only the modules that are compatible with what is already loaded, so you might not be aware of everything that is available. However, the module spider command lets you see all modules. Figure 1 illustrates the module hierarchy of the module files for one of my systems.

Figure 1: Example module file layout.

At the top of the diagram, /cluster/modulefiles is the directory for all module files. Below this directory are three main subdirectories: Core, compiler, and mpi. These directories indicate the dependencies of the various modules. For example, everything in the compiler directory depends on a specific compiler (e.g., GCC 4.8). Everything in the mpi directory is dependent on a specific MPI and compiler combination.

In this layout, Lmod initially reads only the module files in /cluster/modulefiles/Core, so a best practice is to put in this directory any module files that do not depend on either a compiler or an MPI library. This means you also put the compiler module files in the Core directory.
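One way to arrange for only Core to be visible at login is to set the two variables that the module files later in this article rely on. The snippet is a sketch; where it lives (for example, a site-wide shell startup file) is an assumption and varies from site to site.

```shell
#!/bin/sh
# Site-wide setup sketch: only the Core tree is visible at login.
MODULEPATH_ROOT="/cluster/modulefiles"
MODULEPATH="$MODULEPATH_ROOT/Core"
export MODULEPATH_ROOT MODULEPATH
echo "MODULEPATH=$MODULEPATH"
```

Everything outside Core stays hidden until a module file adds another directory to $MODULEPATH.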

The gnu subdirectory under Core is where all of the module files for the GNU family of compilers are stored. A best practice from the developer of Lmod, Dr. Robert McLay at the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, is to make all subdirectories beneath Core, compiler, and mpi lowercase. In McLay’s own words, “Lmod is designed to be easy to use interactively and be easy to type. So I like lower case names where ever possible.” He continues: “I know of some sites that try very hard to match the case of the software: OpenMPI, PETSc, etc. All I can say is that I’m glad I don’t have to work on those systems.”

In the gnu subdirectory is a module file named 4.8.lua, which in Figure 1 is labeled 4.8 (f), with the (f) meaning that it is a file and not a directory. This file contains the details for version 4.8 of the GNU compilers. The extension .lua, although not shown in Figure 1, indicates that the module file is written in Lua; however, as I mentioned before, it could be written in TCL.

If you had versions 4.9 and 5.1 of the GNU compilers, you would have module files named 4.9.lua and 5.1.lua in the /cluster/modulefiles/Core/gnu directory. If you had a different set of compilers (e.g., those from PGI), you would create a directory named pgi under /cluster/modulefiles/Core (i.e., /cluster/modulefiles/Core/pgi) and then place the module files in this subdirectory.
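To get a feel for the layout without touching a real module tree, you can mock it up in a scratch directory. The directory names follow the example in this article; the MPI module file names are inferred from the module names used elsewhere in the text (mpich/3.1 and openmpi/1.8), and the files themselves are empty placeholders.

```shell
#!/bin/sh
# Recreate the example layout in a scratch directory so it can be inspected
# without touching a real module tree.
root=$(mktemp -d)/modulefiles

mkdir -p "$root/Core/gnu" \
         "$root/compiler/gnu/4.8/mpich" \
         "$root/compiler/gnu/4.8/openmpi" \
         "$root/mpi/gnu/4.8/mpich/3.1"

# Empty placeholders stand in for the real module files.
touch "$root/Core/gnu/4.8.lua" \
      "$root/compiler/gnu/4.8/mpich/3.1.lua" \
      "$root/compiler/gnu/4.8/openmpi/1.8.lua"

# Print the module files relative to the root of the tree.
(cd "$root" && find . -name '*.lua' | sort)
```

Adding another compiler family would mean one more subdirectory under Core plus a matching branch under compiler.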

You can find what modules are available in Core with the avail command:

[laytonjb@home4 ~]$ module avail
 
-------------------------- /cluster/modulefiles/Core ---------------------------
   gnu/4.8    lmod/6.0.1    settarg/6.0.1
 
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

For the example module file layout, you see the compiler module (only one in this example) called gnu/4.8. Because it's the only one available, you can go ahead and load it:

[laytonjb@home4 ~]$ module load gnu/4.8
[laytonjb@home4 ~]$ module list
 
Currently Loaded Modules:
  1) gnu/4.8

The compiler module file prepends a directory to $MODULEPATH, the variable Lmod searches for module files, so that only the MPI modules built with the loaded compiler become available to the user. The MPI module files live under the compiler directory because they depend on having a compiler loaded. Now if you type module avail, you get the following response:

[laytonjb@home4 ~]$ module avail
 
-------------------- /cluster/modulefiles/compiler/gnu/4.8 ---------------------
   mpich/3.1    openmpi/1.8
 
-------------------------- /cluster/modulefiles/Core ---------------------------
   gnu/4.8    lmod/6.0.1    settarg/6.0.1
 
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

Notice that Lmod now lists the two MPI libraries that have been built with the compiler you loaded. Now you can load one of the MPI libraries. For example, load MPICH and then see which MPI scripts are in $PATH.

[laytonjb@home4 ~]$ module load mpich/3.1
[laytonjb@home4 ~]$ module list
 
Currently Loaded Modules:
  1) gnu/4.8   2) mpich/3.1
 
[laytonjb@home4 ~]$ which mpicc
/cluster/mpich-3.1.4-gnu-4.8.2/bin/mpicc
[laytonjb@home4 ~]$ which mpif77
/cluster/mpich-3.1.4-gnu-4.8.2/bin/mpif77

Notice that mpicc and mpif77 point to the correct scripts.

An important key to making everything work is in the module files. To better understand these module files, I’ll take a deeper look at them.

Under the Module File Hood

Everything works just great with Lmod so far. I can load, unload, swap, and purge modules and so on. Lmod handles everything quite well; however, Lmod simply executes what the module files tell it to do, so poorly written module files can cause problems. To understand what's happening with the module files, I’ll take a look at the gnu/4.8 compiler module:

-- -*- lua -*-
------------------------------------------------------------------------
-- GNU 4.8.2 compilers - gcc, g++, and gfortran. (Version 4.8.2)
------------------------------------------------------------------------
 
 
help(
[[
This module loads the gcc-4.8 compilers (4.8.2). The
following additional environment variables are defined:
 
CC   (path to gcc compiler wrapper      )
CXX  (path to g++ compiler wrapper      )
FC   (path to gfortran compiler wrapper )
F90  (path to gfortran compiler wrapper )
 
See the man pages for gcc, g++, and gfortran for
more detailed information on available compiler options and
command-line syntax.
]])     
 
 
-- Local variables
local version = "4.8"
local base = "/usr/bin/"
 
 
-- Whatis description
whatis("Description: GNU 4.8 compilers (4.8.2)")
whatis("URL: www.gnu.org")
 
 
-- Normally - Take care of $PATH, $LD_LIBRARY_PATH, $MANPATH
-- Don't add "/usr/bin" to path since it will get removed if module
-- is removed
 
 
-- Environment Variables
pushenv("CC", pathJoin(base,"gcc"))
pushenv("FC", pathJoin(base,"gfortran"))
pushenv("CXX", pathJoin(base,"g++"))
pushenv("F90", pathJoin(base,"gfortran"))
pushenv("cc", pathJoin(base,"gcc"))
pushenv("fc", pathJoin(base,"gfortran"))
pushenv("cxx", pathJoin(base,"g++"))
pushenv("f90", pathJoin(base,"gfortran"))
 
 
-- Setup Modulepath for packages built by this compiler
local mroot = os.getenv("MODULEPATH_ROOT")
local mdir = pathJoin(mroot,"compiler/gnu", version)
prepend_path("MODULEPATH", mdir)
 
 
-- Set family for this module
family("compiler")

For this module file (recall that it’s written in Lua), I use the standard GNU compilers that came with the distribution. Therefore they are installed in /usr/bin and /usr/lib – basically the standard $PATH.

The module file can be broken down into several sections. The first part of the file is the help function, whose text is printed when you ask for help with the module. The next major section is where the environment variables are defined. For this module, the variables are pretty straightforward: CC, cc, fc, FC, and so on.

Notice that I didn't define any changes to $PATH or $LD_LIBRARY_PATH. If I had, then if I were to unload the module, it would erase parts of $PATH, such as /usr/bin. This isn't a good idea, so I didn't modify $PATH, $LD_LIBRARY_PATH, or $MANPATH. This guideline only extends to packages that are installed in the standard $PATH and is not true if the application has been installed somewhere else.
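The concern is easy to demonstrate with plain string manipulation. The sketch below works on a copy of a typical search path and performs a naive "unload" that drops every entry equal to /usr/bin; it illustrates the risk described above rather than Lmod's internal bookkeeping. The variable names are mine.

```shell
#!/bin/sh
# Work on a copy so the real PATH is never modified.
demo_path="/usr/local/bin:/usr/bin:/bin"

# Naive "unload": drop every entry equal to /usr/bin from the list.
after_unload=$(printf '%s\n' "$demo_path" | tr ':' '\n' | grep -vxF '/usr/bin' | paste -sd: -)

echo "before: $demo_path"
echo "after:  $after_unload"
```

After the removal, the copy contains only /usr/local/bin and /bin, which is exactly the situation you want to avoid with the real $PATH.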

The next to last section, Setup Modulepath for packages built by this compiler, is very important. Here, I work with two environment variables that matter to Lmod: I read $MODULEPATH_ROOT and modify $MODULEPATH. The line

local mdir = pathJoin(mroot,"compiler/gnu", version)

creates a local variable named mdir, which joins the variable mroot ($MODULEPATH_ROOT), compiler/gnu, and the version (4.8) into a single path. The prepend_path line that follows adds this directory to $MODULEPATH, which tells Lmod that subsequent module avail commands should also look in the compiler/gnu/4.8 subdirectory under the main modules directory, corresponding to the compilers just loaded (gnu/4.8). As the writer of the modules, you control where the module files that depend on the compilers are located. This step is the key to module hierarchy. You can control what modules are subsequently available by changing the directory that is prepended to $MODULEPATH (mdir in this case). This key attribute of Lmod gives you great flexibility.
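The effect of those lines of Lua can be mimicked in plain shell, which makes the resulting search order easy to see. The values are the ones used in this article; the demo_modulepath name is mine, chosen so that the real $MODULEPATH is not touched.

```shell
#!/bin/sh
# Values taken from the example: the module root and the compiler version.
mroot="/cluster/modulefiles"
demo_modulepath="$mroot/Core"
version="4.8"

# Equivalent of pathJoin(mroot, "compiler/gnu", version)
mdir="$mroot/compiler/gnu/$version"

# Equivalent of prepend_path("MODULEPATH", mdir)
demo_modulepath="$mdir:$demo_modulepath"

# One directory per line, in search order.
echo "$demo_modulepath" | tr ':' '\n'
```

The compiler-specific directory comes out first and Core second, the same order in which module avail lists the two directories once gnu/4.8 is loaded.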

The very last line in the module file, the statement family("compiler"), is optional but can make life better and easier for users (i.e., a best practice). The function family tells Lmod to which family the module belongs. A user can have only one module from a given family loaded at a time. In this case, the family is compiler, so no second compiler can be loaded alongside this one. (You would hope all other compiler modules also use the family statement.) Adding this line helps users avoid self-inflicted problems, so I highly recommend using it.
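A toy model helps show what "one module per family" means. The shell sketch below keeps a single slot for the compiler family; it mimics the idea only and says nothing about how Lmod implements it. The function name and the bare pgi module name are hypothetical. Depending on how it is configured, a real Lmod installation either swaps the two modules or reports an error when a second member of the same family is requested.

```shell
#!/bin/sh
# Toy model of a "family" slot: at most one compiler module is recorded.
# This mimics the idea only; it is not how Lmod is implemented.
compiler_family=""

load_compiler() {
    if [ -n "$compiler_family" ] && [ "$compiler_family" != "$1" ]; then
        echo "replacing $compiler_family with $1 (same family: compiler)"
    fi
    compiler_family="$1"
}

load_compiler gnu/4.8
load_compiler pgi          # hypothetical second compiler module
echo "loaded compiler: $compiler_family"
```

Whatever the configured behavior, the user never ends up with two compilers from the same family active at once.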

If the gnu/4.8 compiler is loaded, then the diagram of the module layout should look something like Figure 2. The green labels indicate the module that is loaded and the path to the modules that depend upon it (the MPI modules). Note that the MPI modules are under the compiler directory because they depend on the compiler module that is loaded.

Figure 2: Active path after gnu/4.8 is loaded.

In the previous section, I loaded the mpich/3.1 module. The listing here is for that module, which was built with the gnu/4.8 compilers.

-- -*- lua -*-
------------------------------------------------------------------------
-- mpich-3.1 (3.1.4) support. Built with gcc-4.8 (4.8.2)
------------------------------------------------------------------------
 
 
help(
[[
This module loads the mpich-3.1 MPI library built with the gcc-4.8
compilers (4.8.2). It updates the PATH, LD_LIBRARY_PATH, 
and MANPATH environment variables to access the tools for
building MPI applications using MPICH, libraries, and
available man pages, respectively.
 
This was built using the GNU compilers, version 4.8.2.
 
The following additional environment variables are also defined:
 
MPICC   (path to mpicc compiler wrapper   )
MPICXX  (path to mpicxx compiler wrapper  )
MPIF77  (path to mpif77 compiler wrapper  )
MPIF90  (path to mpif90 compiler wrapper  )
MPIFORT (path to mpifort compiler wrapper )
 
See the man pages for mpicc, mpicxx, mpif77, and mpif90 for
more detailed information on available compiler options and
command-line syntax. Also see the man pages for mpirun or
mpiexec on executing MPI applications.
]])     
 
 
-- Local variables
local version = "3.1.4"
local base = "/cluster/mpich-3.1.4-gnu-4.8.2"
 
 
-- Whatis description
whatis("Description: MPICH-3.1.4 with GNU 4.8 compilers")
whatis("URL: www.mpich.org")
 
 
-- Take care of $PATH, $LD_LIBRARY_PATH, $MANPATH
prepend_path("PATH", pathJoin(base,"bin"))
prepend_path("PATH", pathJoin(base,"include"))
prepend_path("LD_LIBRARY_PATH", pathJoin(base,"lib"))
prepend_path("MANPATH", pathJoin(base,"share/man"))
 
 
-- Environment Variables
pushenv("MPICC", pathJoin(base,"bin","mpicc"))
pushenv("MPICXX", pathJoin(base,"bin","mpic++"))
pushenv("MPIF90", pathJoin(base,"bin","mpif90"))
pushenv("MPIF77", pathJoin(base,"bin","mpif77"))
pushenv("MPIFORT", pathJoin(base,"bin","mpifort"))
pushenv("mpicc", pathJoin(base,"bin","mpicc"))
pushenv("mpicxx", pathJoin(base,"bin","mpic++"))
pushenv("mpif90", pathJoin(base,"bin","mpif90"))
pushenv("mpif77", pathJoin(base,"bin","mpif77"))
pushenv("mpifort", pathJoin(base,"bin","mpifort"))
 
 
-- Setup Modulepath for packages built by this compiler/mpi
local mroot = os.getenv("MODULEPATH_ROOT")
local mdir = pathJoin(mroot,"mpi/gnu", "4.8","mpich","3.1")
prepend_path("MODULEPATH", mdir)
 
 
-- Set family for this module (mpi)
family("mpi")

If you compare this module file to the compiler module file, you will see many similarities. However, in this module file, the classic environment variables, $PATH, $LD_LIBRARY_PATH, and $MANPATH, are modified. Because you want the MPI tools associated with the module to be “first” in $PATH, the Lmod function prepend_path is used.
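Why being first matters can be shown with two fake mpicc scripts in scratch directories. The directory names and scripts are made up for the demonstration, which is best run in a throwaway shell because it changes $PATH for that session.

```shell
#!/bin/sh
# Two scratch directories, each with its own fake "mpicc" script.
scratch=$(mktemp -d)
mkdir -p "$scratch/system/bin" "$scratch/mpich/bin"
for d in system mpich; do
    printf '#!/bin/sh\necho "mpicc from %s"\n' "$d" > "$scratch/$d/bin/mpicc"
    chmod +x "$scratch/$d/bin/mpicc"
done

# Start with only the "system" copy on PATH, then prepend the "mpich" copy.
PATH="$scratch/system/bin:$PATH"
first=$(command -v mpicc)
PATH="$scratch/mpich/bin:$PATH"
second=$(command -v mpicc)

echo "before prepend: $first"
echo "after prepend:  $second"
```

The shell picks the first match it finds, so the most recently prepended directory wins, which is the behavior you want from the MPI module.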

Toward the end of the file, examine the code for the Modulepath. The local variable mdir points to the “new” module subdirectory, which is mpi/gnu/4.8/mpich/3.1. (Technically, the full path is /cluster/modulefiles/mpi/gnu/4.8/mpich/3.1 because $MODULEPATH_ROOT is /cluster/modulefiles.) In this subdirectory, you should place all modules that point to tools that have been built with both the gnu/4.8 compilers and the mpich/3.1 tools. Examples of module files that depend on both a compiler and an MPI tool are applications or libraries such as PETSc.

Notice that the mpich/3.1 module also uses the family() function so that the user cannot load a second MPI module. You could even have a family() function for libraries such as PETSc.

Module Usage

In an article a couple of years ago, I presented a way to gather logs about TCL/C Environment Modules usage. It was a bit of a kludge, but it did allow me to gather data about module usage. In version 6.x of Lmod, this ability was brought to the forefront.

Tracking module usage is conceptually fairly easy, but a number of steps are involved. Having this information is valuable, because it allows you to track which tools are used the most. (I associate one tool with one module.) If you have various versions of a specific tool, it allows you to track the usage of each so that you can either deprecate an older version or justify keeping it around and maintaining it. You can also see which modules are used as a function of time, which helps you understand when people run their jobs.
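Once load records exist in some form, the tallying itself is simple. The sketch below assumes a made-up log with one "user module/version" record per load; the format, the user names, and the records are invented for the illustration and do not reflect what Lmod actually writes.

```shell
#!/bin/sh
# Hypothetical log: one "user module/version" record per module load.
# The format and the records are assumptions made for this sketch.
log=$(mktemp)
cat > "$log" <<'EOF'
alice gnu/4.8
alice mpich/3.1
bob gnu/4.8
bob openmpi/1.8
carol gnu/4.8
EOF

# Count loads per module, most used first.
awk '{ print $2 }' "$log" | sort | uniq -c | sort -rn

# Pick out the single most loaded module.
top_module=$(awk '{ print $2 }' "$log" | sort | uniq -c | sort -rn | awk 'NR == 1 { print $2 }')
echo "most loaded: $top_module"
```

With the sample records, gnu/4.8 comes out on top with three loads. The same pipeline, grouped by version or by date, would give the deprecation and time-of-use views mentioned above.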

The topic is important enough to warrant its own article, so stay tuned for an upcoming article on gathering module usage information.

Summary

Although I've written about Lmod before, I continue to come back to it because it is so useful. It greatly helps users sort out their environment so that they don't accidentally load conflicting modules. The first time you have to debug a user's code when they have mixed MPI implementations, you will be thankful for Lmod.

Environment modules in general, and Lmod specifically, allow you to keep multiple versions of the same package on a system to service applications that have been built with older versions of a compiler, MPI, or library. I even saw a recent posting to the Open MPI mailing list asking about LAM-MPI, even though it basically has been dead for a decade. You would be surprised how long applications stick around and bring their dependencies with them.

Because Lmod can read TCL module files in addition to Lua (the preferred language), you can move easily from TCL/C Environment Modules to Lmod. As you can see from the Lua module file examples I used here, the syntax is very clean and simple, making them very easy to read.

Finally, Lmod is developing tools to audit module usage. This information is amazingly useful, as pointed out in two articles from Harvard (Scientific Software as a Service Sprawl, parts one and two). The author gives a very good explanation about how to set up Lmod for collecting module usage, putting it into a database, and mining that database – which is very cool stuff indeed.

Look for an upcoming article that goes into more depth on this topic.