Remora provides per-node and per-job resource utilization data that can be used to understand how an application performs on the system through a combination of profiling and system monitoring.

Remora – Resource Monitoring for Users

Jeff Layton

While I was chatting with some colleagues, the conversation turned to applications that return unusual results even though previous runs had produced the expected results. As we discussed ways to tell whether applications were running correctly or incorrectly, I thought it would be great to get a snapshot of what was happening on all of the nodes involved in the job. I like to think of this as “application telemetry.”

A simple search on “telemetry” brings up a definition like, “… the process of recording and transmitting the readings of an instrument.” In this case, the instrument is the high-performance computing (HPC) system, and the readings are resource aspects of the system (e.g., CPU, memory, network, and storage usage). Here, the telemetry is used to help the user understand what their application was doing during execution. The key word in that last sentence is user. The point is that users should have access to the resource usage of their own applications so they can spot problems, which prompted me to revisit Remora.

REMORA: REsource MOnitoring for Remote Applications, from the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, combines monitoring and profiling to provide information about your application. Not strictly a profiler and not strictly a monitoring tool in the traditional sense of monitoring the entire cluster, Remora provides per-node and per-job resource utilization data that can be used to understand how an application performs on the system. The user (not just the admin) can go back and examine what was happening on the nodes while a job was running.

The goal of Remora is simplicity: It relies on commonly installed tools and focuses on the user, putting data, and ideally insight, in the user’s hands (and probably the admin’s if an issue crops up). The data can also be used by admins in a collective way to understand how the system is being used.

Overview

Before diving into the ins and outs of Remora, keep in mind two things: (1) It is focused on the user, and (2) it is neither strictly a profiling tool nor strictly a monitoring tool. In essence, Remora is a higher-level usage reporting tool that tells you what resources your job used, along with some associated details. It does not dive deep – that’s more the function of a profiler – and it doesn’t produce fine-grained system monitoring details; rather, it focuses on information that the user can use to judge whether a job seems to be performing correctly and, if not, to start the “debugging” process. System admins have access to the same information, so they too can examine the job if the user feels something went wrong.

Remora collects several streams of information with simple userspace tools:

Memory usage, including CPUs, Xeon Phi, and Nvidia GPUs
CPU utilization
I/O usage – Lustre, data virtualization service (DVS)
Nonuniform memory access (NUMA) properties
Network topology
Message passing interface (MPI) communication statistics (currently you have to use Intel MPI or MVAPICH2)
Power consumption
CPU temperatures
Detailed application timing

To capture all this information, Remora uses SSH to connect to every node used by the application, spawns a background task on each node, and captures the data at regular intervals. However, I/O data is only captured on the master node of the application.

No Remora-specific applications are used to gather the information. Rather, existing applications are used along with information parsed from the /proc filesystem. A partial list of the tools and data sources used is:

numastat
mpstat (one of my personal favorites)
nvidia-smi
ibtracert
ibstatus
xltop
mpip
python
/proc/meminfo
/proc/[pid]/status
/proc/sys/lnet/stats
/sys/class/infiniband …

Remora uses these tools and sources to collect information at a specific interval while the application runs. Because Remora runs in user space, it only collects information associated with the application and can’t gather data that requires elevated privileges. In the case of MPI applications, it grabs the list of host nodes from environment variables and uses that list to SSH into the nodes for data gathering.
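As a rough illustration of this kind of userspace collection, the following Python sketch parses /proc/meminfo-style text and takes a few snapshots at a fixed interval. It is not Remora's code: The function names (parse_meminfo, sample_meminfo), the sample text, and the default interval are assumptions made for the example.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def sample_meminfo(count=3, interval=1.0, path="/proc/meminfo"):
    """Take `count` snapshots, `interval` seconds apart (Linux only)."""
    samples = []
    for i in range(count):
        with open(path) as handle:
            samples.append((time.time(), parse_meminfo(handle.read())))
        if i < count - 1:
            time.sleep(interval)
    return samples


# Hypothetical sample text, used so the parser can be tried on any OS.
SAMPLE = """MemTotal:       16384000 kB
MemFree:         8192000 kB
HugePages_Total:       0
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(info["MemTotal"] - info["MemFree"])  # prints 8192000
```

On a Linux node, calling sample_meminfo() returns a list of (timestamp, values) pairs that could be written to a file or plotted later.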

As of version 1.8.4, Remora requires either Intel MPI or MVAPICH2. The profiling library mpiP is used in conjunction with one of these two MPI libraries to gather MPI profile stats.

When Remora is finished, it creates a directory in the form remora_XXX in the directory in which the application runs. Subdirectories contain the raw data, and you’ll find an HTML page you can open to examine and plot the data (this is REALLY amazing!).
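If you have run several jobs from the same directory, a few lines of Python can list the results directories and the HTML pages inside them. This is only a sketch: The helper name find_remora_reports is hypothetical, and because the name of the HTML page is not assumed here, the code simply searches each remora_* directory for any file ending in .html.

```python
from pathlib import Path


def find_remora_reports(run_dir="."):
    """Map each remora_* directory under run_dir to the HTML files it contains."""
    reports = {}
    for result_dir in sorted(Path(run_dir).glob("remora_*")):
        if result_dir.is_dir():
            reports[result_dir.name] = sorted(
                str(page.relative_to(result_dir))
                for page in result_dir.rglob("*.html")
            )
    return reports


if __name__ == "__main__":
    for name, pages in find_remora_reports().items():
        print(name, pages)
```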

Remora collects data from as many of the sources as possible. For example, if it detects that Lustre is installed, it will grab data for that. If it detects the presence of an InfiniBand network, it will collect data for that. If it doesn’t detect something, it won’t try to gather data for it, and you won’t be able to create a chart for the data.

The sources for which it attempts to gather information are controlled by a configuration file. On my test system the path to the file is /home/<user>/bin/remora-1.8.4/bin/config/modules. You can edit that file to remove resources you do not want gathered. When installed, that file contains all the sources of information. In the default list below, each line comprises the name of the module and the directory to which the metric belongs.

cpu,CPU
memory,MEMORY
numa,NUMA
dvs,IO
lustre,IO
lnet,IO
ib,NETWORK
gpu,MEMORY
network,NETWORK
power,POWER
temperature,TEMPERATURE
eth,NETWORK

For example, if you do not have an InfiniBand network, you can remove the ib,NETWORK line from the list, and Remora won’t attempt to gather that information.

For this article, my config file is:

cpu,CPU
memory,MEMORY
eth,NETWORK
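Trimming the file by hand is easy enough, but the same step can be sketched in Python. The example below, which is not part of Remora, parses the name,DIRECTORY lines shown in the default list above and keeps only the modules you name. The helper names (parse_modules, keep_modules, format_modules) are hypothetical, and the sketch assumes the file contains nothing but such lines.

```python
DEFAULT_MODULES = """cpu,CPU
memory,MEMORY
numa,NUMA
dvs,IO
lustre,IO
lnet,IO
ib,NETWORK
gpu,MEMORY
network,NETWORK
power,POWER
temperature,TEMPERATURE
eth,NETWORK
"""


def parse_modules(text):
    """Turn 'name,DIRECTORY' lines into a list of (name, directory) pairs."""
    modules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name, _, directory = line.partition(",")
        modules.append((name.strip(), directory.strip()))
    return modules


def keep_modules(modules, wanted):
    """Keep only the modules whose name is in `wanted`, preserving file order."""
    return [(name, directory) for name, directory in modules if name in wanted]


def format_modules(modules):
    """Render (name, directory) pairs back into the file's line format."""
    return "\n".join(f"{name},{directory}" for name, directory in modules) + "\n"


if __name__ == "__main__":
    trimmed = keep_modules(parse_modules(DEFAULT_MODULES), {"cpu", "memory", "eth"})
    print(format_modules(trimmed), end="")
```

Run as a script, this prints the same three lines as the config file used for this article. Writing the result back to the modules file is left out on purpose, so you can review the output before changing the installation.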

Installing Remora

Installation is not difficult. The approach is slightly different from the usual ./configure; make; make install, and you need to be aware that because Remora can provide MPI statistics, you need to build it with the intended version of MPI (i.e., do not cross the MPIs). Of course, you don’t have to use MPI tools; without them, Remora will just continue with the configuration and installation.

For this article, I built Remora into my home account with the command:

REMORA_INSTALL_PREFIX=/home/laytonjb/bin/remora-1.8.4 ./install.sh

You can install it in a common directory if all users are to have access. (Note that a previous article on Remora originally had a typo (now corrected) in the installation command, in which the letter P was missing in PREFIX.)

If you use multiple versions of MPI, you need to build Remora for each version. If you are using environment modules (e.g., Lmod), you can easily write an environment module for Remora so that it is added to the user environment when the corresponding MPI module is loaded.
