Photo by Nadjib BR on Unsplash

A ptrace-based tracing mechanism for syscalls

Hidden Treasures

Article from ADMIN 71/2022
The libiotrace library monitors running programs, whether statically or dynamically linked, and collects detailed data on many file-I/O-related function calls.

Scientific applications can be limited by file I/O because of their distributed nature and implementation of storage. The libiotrace library, extended with ptrace-based syscall tracing, lets users and developers analyze file-I/O-based bottlenecks.


Dynamic profiling analyzes a program during runtime, thereby gathering information about resource utilization. Often, you want to know which part of a program causes the highest CPU utilization or which task uses how much memory. This information can be used either to optimize the program or to manage a running system and prevent bottlenecks. You can find a multitude of tools for gathering such data.

If the source code of the program is available, you can use a debugger or an instrumentation framework to gather resource usage information. Most compilers offer an instrumentation framework that allows you to insert profiling functionality at compile time. For example, the GNU Compiler Collection (GCC) offers support for gprof [1], which can be used to analyze the time spent in each part of a program.

If you don't have the source code or don't want to recompile the program in question, you can use the LD_PRELOAD environment variable. Launching a program causes the executable to be loaded into memory and executed as a new process. All the dynamic libraries (so-called shared objects) on which the program depends are loaded into the process. Once all libraries are part of the process memory, the dynamic linker collects all exposed functions from the libraries and links them against function calls. If a function is provided by more than one library, the first match is used for linking.

The environment variable LD_PRELOAD allows you to load and link a library before any other library is loaded, so you can use it to get the program to call your own provided implementation of a library function. The libiotrace implementation of the function gathers information (e.g., execution time of the function and function parameters) and writes the collected data to a buffer. It also calls the function that would have been called without LD_PRELOAD by resolving the address of the function with dlsym, which then returns the second match from all loaded libraries. The set of tools in gperftools [2] uses LD_PRELOAD, as well.
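The wrapping technique described above can be sketched in a few lines of C. This is a minimal illustration, not the libiotrace implementation: it assumes glibc on Linux, looks up the next definition of open with dlsym and the RTLD_NEXT pseudo-handle, and uses a plain counter (open_calls, a hypothetical name) in place of a real trace buffer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes RTLD_NEXT; must precede any #include */
#endif
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stddef.h>
#include <sys/types.h>

/* Fallback for builds where _GNU_SOURCE was not in effect when <dlfcn.h>
 * was first included; the value matches glibc's own definition. */
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *)-1L)
#endif

/* Hypothetical counter standing in for a real trace buffer. */
static unsigned long open_calls;

/* Hypothetical accessor so the counter can be inspected. */
unsigned long wrapped_open_calls(void)
{
    return open_calls;
}

typedef int (*open_fn)(const char *, int, ...);

/* Same name and signature as the libc function, so the dynamic linker
 * binds calls to this definition when it is found first. */
int open(const char *path, int flags, ...)
{
    static open_fn real_open;
    mode_t mode = 0;

    /* The third argument is only present when a file may be created. */
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    /* Resolve the next definition of open in the search order once. */
    if (real_open == NULL) {
        *(void **)&real_open = dlsym(RTLD_NEXT, "open");
        if (real_open == NULL) {
            errno = ENOSYS;
            return -1;
        }
    }

    open_calls++;                        /* record the call */
    return real_open(path, flags, mode); /* forward to the original */
}
```

Built as a shared object (e.g., gcc -shared -fPIC -o libwrap.so wrap.c, possibly with -ldl on older glibc versions; the file names are placeholders) and activated with LD_PRELOAD=./libwrap.so, a wrapper of this kind only sees calls that the dynamic linker actually binds to it.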

libiotrace: Just Another Profiling Tool?

A typical CPU computes data faster than data can be fetched from or stored to main memory (the so-called memory wall) [3]. Storage only exacerbates this problem. File I/O can therefore be a highly relevant factor for program optimization. The libiotrace [4] library uses LD_PRELOAD to gather data about POSIX [5] and MPI [6] file I/O functions. Although other tools, such as Darshan [7], use this method too, libiotrace adds live tracing support, as opposed to the pure postmortem analysis of most tools.

A typical tracing setup for libiotrace is shown in Figure 1, which illustrates how LD_PRELOAD is used to "insert" libiotrace between the profiled program and libc.

Figure 1: The libiotrace tracing setup.

Typically, the data is either written to a logfile, sent to an InfluxDB instance, or both. InfluxDB serves as a data source for Grafana for near real-time visualization.

Supplemental Profiling with ptrace

During work on libiotrace, a few instances of file I/O couldn't be traced by wrapping the file I/O functions with LD_PRELOAD. This file I/O evidently uses function calls that are not dynamically linked against the libiotrace wrappers when the executable starts. Tracing this file I/O required further investigation.

Running the program with strace revealed that the file I/O in question does in fact call the POSIX function open64. The same behavior could be observed in a debugger. A closer look at the function call in the stack trace of the debugger showed the root cause: The function call wasn't going through the procedure linkage table (PLT), which is generated during compile time to enable relocations of function addresses. Each entry in the PLT is a stub function that is called instead of the function itself (located in a shared object).

During runtime, the dynamic linker looks up the function address in the shared object and provides it to the stub function in the PLT. The stub function then uses this address to call the function in the shared object file. A PLT entry is thus a layer of indirection used to relocate function calls in a single place per loaded object. A disassembly of the library in question showed that open64 had no PLT entry in this library. An example of this situation can be found in the implementation of dlopen in libdl (the linker itself).

The dlopen function can be used by a program to load a dynamic library. In fact, this function is used by the linker itself to load and link shared object files. To open and read a shared object file, dlopen calls an implementation of open or open64 (on a 64-bit system), which are both part of the libc library. A call to dlopen can happen before the dynamically linked libc is available (e.g., during the loading process of libc).

Therefore, a relocation of open or open64 during a call of dlopen is not feasible, so the dlopen implementation ships with its own statically linked version of the open functions. On further research, even more libraries with statically linked versions of POSIX file I/O functions were found (e.g., some versions of the libpthread library).

To make matters worse, other mechanisms can also prevent linking against a function loaded by LD_PRELOAD. For example, the linker options -Bsymbolic and -Bsymbolic-functions can ensure that a call to a function will use a local function inside a shared library and not a function exposed from a different object. Furthermore, flags for dlopen (RTLD_NOW and RTLD_DEEPBIND) can change the order in which dynamic libraries are searched for functions to link.
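For readers who have not used dlopen directly, the following sketch shows explicit loading from a program's point of view. It assumes a glibc system; the helper name call_from_library is hypothetical, and the library and symbol passed to it are only example inputs. To map the library, dlopen has to open and read the shared object file itself, which is the kind of file I/O that bypassed the wrappers.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef double (*unary_fn)(double);

/* Hypothetical helper: loads a shared object at run time, looks up a
 * function of type double(double), and calls it with arg.
 * Returns 0 on success and stores the function's result in *result. */
int call_from_library(const char *lib, const char *symbol,
                      double arg, double *result)
{
    /* RTLD_NOW resolves all undefined symbols before dlopen returns. */
    void *handle = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    void *addr = dlsym(handle, symbol);
    if (addr == NULL) {
        dlclose(handle);
        return -1;
    }

    /* Convert the object pointer to a function pointer as suggested
     * in the POSIX rationale for dlsym. */
    unary_fn fn;
    *(void **)&fn = addr;

    *result = fn(arg);
    dlclose(handle);
    return 0;
}
```

On glibc, call_from_library("libm.so.6", "cos", 0.0, &r) would load the math library and call cos. Adding the glibc-specific RTLD_DEEPBIND flag to the dlopen call would make the loaded library prefer its own symbols over identically named symbols already in the global scope, which includes wrappers loaded through LD_PRELOAD.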

In conclusion, LD_PRELOAD is not sufficient if you want to profile all file I/O. Enter syscall tracing.
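To give a rough idea of what syscall tracing looks like, the following sketch uses the Linux ptrace interface to count the syscall-stops of a small child process. It is an illustration under stated assumptions, not the mechanism libiotrace uses: the helper name count_syscall_stops is hypothetical, the child's workload is a single open and close, and the tracer only counts stops instead of decoding syscall numbers or arguments.

```c
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs a small child under ptrace and returns the
 * number of syscall-stops seen (entry and exit are counted separately),
 * or -1 if tracing could not be set up. */
long count_syscall_stops(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced, then stop until the parent is ready. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);

        /* File I/O that reaches the kernel however the call was linked. */
        int fd = open("/dev/null", O_RDONLY);
        if (fd >= 0)
            close(fd);
        _exit(0);
    }

    int status;
    /* Wait for the child's initial SIGSTOP. */
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* Have the kernel set bit 7 of the stop signal for syscall-stops, so
     * they can be told apart from ordinary SIGTRAP deliveries. */
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    long stops = 0;
    int sig = 0; /* signal to deliver on resume; the first SIGSTOP is dropped */

    for (;;) {
        /* Resume the child until the next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;

        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            stops++;                 /* a syscall entry or exit */
            sig = 0;
        } else if (WIFSTOPPED(status)) {
            sig = WSTOPSIG(status);  /* pass any other signal on */
        }
    }
    return stops;
}
```

Because the stops are generated by the kernel, they occur regardless of whether the call went through the PLT, a statically linked copy of open, or a library built with -Bsymbolic. A real tracer would additionally read the syscall number and arguments at each stop, which is architecture specific and omitted here.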
