
Photo by Barnabas Davoti on Unsplash
File access optimization discovery and visualization
Fast and Lean
Analyzing a program's file I/O is difficult and time-consuming, especially if you have to change an existing program that many different developers have worked on over a long history. Tool support is therefore important.
Most existing tools (e.g., Darshan [1], Vampir [2], libiotrace [3]) are profilers that collect data at run time and provide it for later analysis. Building on libiotrace profiling data, we developed an automatic analysis tool that finds critical access patterns and suggests how to improve them for POSIX [4] and MPI [5] file I/O (see the "Basics of File Access" box).
Basics of File Access
Before getting into the details, a good starting point is to describe how file access works internally and how several simultaneous file accesses on one file can interfere with each other.
Opening a file with low-level file access (e.g., in C) creates a file handle with a unique ID representing the opened file. In POSIX, the handles are called file descriptors or streams, whereas in MPI they are called MPI file handles. Because the underlying functionality is the same, the tool presented in this article addresses the different types of handles alike, and we generalize them by calling them active file accesses.
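The idea of a handle with a unique ID can be illustrated with a minimal sketch. It is not part of the tool described in this article; it uses Python's os module, which wraps the POSIX open, lseek, read, and write calls, and the names open_two_accesses, demo, and data.bin are hypothetical. The sketch opens the same file twice in one process and shows that each open yields its own file descriptor with its own file offset, although both descriptors refer to the same file content.

```python
import os
import tempfile


def open_two_accesses(path):
    """Open the same file twice and return both file descriptors."""
    fd_a = os.open(path, os.O_RDWR)    # first active file access
    fd_b = os.open(path, os.O_RDONLY)  # second access to the same file
    return fd_a, fd_b


def demo():
    """Show that two descriptors on one file have separate offsets."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"0123456789")

        fd_a, fd_b = open_two_accesses(path)
        try:
            # Each open() call returns a different ID.
            distinct = fd_a != fd_b

            # Moving the offset of fd_a does not move the offset of fd_b.
            os.lseek(fd_a, 5, os.SEEK_SET)
            pos_a = os.lseek(fd_a, 0, os.SEEK_CUR)
            pos_b = os.lseek(fd_b, 0, os.SEEK_CUR)

            # Data written through fd_a is read back through fd_b,
            # because both descriptors refer to the same file.
            os.write(fd_a, b"XXXXX")
            seen_by_b = os.read(fd_b, 10)
        finally:
            os.close(fd_a)
            os.close(fd_b)
        return distinct, pos_a, pos_b, seen_by_b


if __name__ == "__main__":
    print(demo())
```

Because the two descriptors share the file but not the offset, the order of calls on each of them matters as soon as one of them writes, which is the kind of interaction an analysis of active file accesses has to consider.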
Both MPI and POSIX processes can open multiple files simultaneously, and an active file may have multiple concurrent accesses from a single process (Figure 1). The functions called on these active file accesses become meaningful, but do they interfere with each other, or can they, for
