Fundamentals of I/O benchmarking

Measure for Measure


In most cases, you will not be planning systems on a greenfield site but will want to replace or optimize an existing system. The requirements are typically not formulated in technical terms (e.g., latency < 1ms, 2,000 IOPS) but reflect the user's perspective (e.g., "make it faster").

Because the storage system is already in use, you can determine its load profile. Excellent tools for this are built into the operating system: iostat and blktrace. Iostat has already been introduced. The blktrace command does not just report the average size of I/O operations; it displays each operation separately, along with its size. Listing 6 contains an excerpt from the output generated by:

blktrace -d /dev/sda -o - | blkparse -i -

Listing 6

Excerpt from the Output of blktrace

8,0    0       22     0.000440106     0  C  WM 23230472 + 16 [0]
8,0    0        0     0.000443398     0  m   N cfq591A  / complete rqnoidle 0
8,0    0       23     0.000445173     0  C  WM 23230528 + 32 [0]
8,0    0        0     0.000447324     0  m   N cfq591A  / complete rqnoidle 0
8,0    0        0     0.000447822     0  m   N cfq schedule dispatch
8,0    0       24     3.376123574   351  A   W 10048816 + 8 <- (8,2) 7783728
8,0    0       25     3.376126942   351  Q   W 10048816 + 8 [btrfs-submit-1]
8,0    0       26     3.376136493   351  G   W 10048816 + 8 [btrfs-submit-1]

Values annotated with a plus sign (+) give you the block size of the operation: in this example, 16, 32, and 8 sectors. At 512 bytes per sector, this corresponds to 8,192, 16,384, and 4,096 bytes, which lets you determine which block sizes are really relevant – and the storage performance for exactly these values can then be measured with iops. Blktrace shows much more, because it listens in on the Linux block I/O layer. Of particular interest here is the opportunity to view the requested I/O block sizes and the distribution of read and write accesses.
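If you want an overview rather than individual events, you can aggregate the sizes yourself. The following one-liner is a minimal sketch: The field positions assume the default blkparse output format shown in Listing 6 (the sector count follows the plus sign), and only completed requests (C events) are counted:

# Histogram of completed request sizes in 512-byte sectors (stop with Ctrl+C)
blktrace -d /dev/sda -o - | blkparse -i - | \
  awk '$6 == "C" && $9 == "+" { n[$10]++ } END { for (s in n) print s " sectors: " n[s] }'

Each output line shows a request size and how often it occurred, which makes the dominant block sizes obvious at a glance.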

The strace command shows whether the workload requests asynchronous or synchronous I/O. Async I/O is based on the principle that a request for a block of data is sent and the program continues working before the answer arrives. When the data block then arrives, the consuming application receives a signal and can respond. As a result, the application is less sensitive to high storage latency and depends more on throughput.
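A quick way to check is to count the relevant system calls. The following is a minimal sketch assuming native Linux AIO (libaio); the PID is a hypothetical placeholder, and applications using other mechanisms (e.g., io_uring) show up under different syscall names:

# Tally sync vs. async I/O syscalls of a running process; press Ctrl+C for the summary
strace -f -c -e trace=pread64,pwrite64,io_submit,io_getevents -p 1234

If io_submit and io_getevents dominate the summary table, the workload submits asynchronous requests; a workload built around pread64/pwrite64 is synchronous.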


I/O benchmarks have three main goals: identifying the optimal settings to ensure the best possible application performance, evaluating storage systems so you can purchase or lease the best storage environment, and identifying performance bottlenecks.

Evaluation of storage systems – whether in the cloud or in the data center – should never be isolated from the needs of the application. Depending on the application, storage optimized for throughput, latency, IOPS, or CPU should be chosen. Databases require low latency for their logfiles and many IOPS for their data files. File servers need high throughput. For other applications, the requirements can be determined with programs like iostat, strace, and blktrace.
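For a first impression, the extended device statistics are usually enough. This is a minimal sketch; the exact column names (e.g., avgqu-sz vs. aqu-sz) vary between sysstat versions:

# Extended per-device statistics in megabytes, refreshed every five seconds
iostat -xm 5

The request size, queue length, and await columns indicate whether the workload is dominated by small random requests or by large sequential transfers.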

Once the requirements are known, the exact characteristics of these indicators can be tested using benchmarks. Depending on the question, the performance of the I/O stack with or without the filesystem will be of interest. The iops and iometer tools bypass the filesystem.
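Fio (mentioned below among the filesystem benchmarks) can also target a raw device and thereby bypass the filesystem. The following job is a minimal sketch – the device name and parameters are placeholders you must adapt, and a write test against a raw device would destroy its data:

# Random 4KB reads against the raw device, bypassing the page cache and filesystem
fio --name=randread --filename=/dev/sdb --direct=1 --rw=randread \
    --bs=4k --ioengine=libaio --iodepth=32 --runtime=60 --time_based \
    --group_reporting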

Concentrating on the same metrics helps you investigate performance bottlenecks. Often, you will see a correlation between high latency, long request queues (measurable with iostat), and negative user perception. Even if the throughput is far from exhausted, the IOPS limit can act as a bottleneck that slows down storage performance.

Read and write caches work in opposite directions. Because of the caching effect, performance measurements against a read cache first increase over a few cycles (as the cache warms up), whereas measurements against a write cache drop after a short time (once the cache fills up). Many caches use read-ahead for read operations; for write operations, the effects of I/O elevator scheduling are added. For both reads and writes, neighboring operations are merged in the cache. Random I/O requests reduce the probability that adjacent blocks are affected – and thus reduce access performance. This puts magnetic hard disks at a greater disadvantage than SSDs.
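The read-ahead window is one of the few of these factors you can inspect and tune directly. A minimal sketch, assuming a device at /dev/sda; the value is counted in 512-byte sectors:

# Show the current read-ahead setting (in 512-byte sectors)
blockdev --getra /dev/sda
# Set it to 512 sectors (256KB), which can help sequential workloads
blockdev --setra 512 /dev/sda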

Once you keep an eye on the key figures and the influencing factors, you can turn to the benchmark tools. In this article, we presented two storage benchmark tools in the form of iometer and iops. For pure filesystem benchmarks, however, fio, bonnie, or iozone are good choices. Hdparm is useful for measuring the physical starting situation.
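For that physical baseline, two hdparm timing runs are often all you need. A minimal sketch; repeat the measurement a few times, because individual results scatter:

# -T measures reads from the cache, -t buffered reads from the disk itself
hdparm -tT /dev/sda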

The blktrace, strace, and iostat tools are useful for monitoring. Iostat gives a good overview of all the indicators, whereas blktrace listens in on individual operations. Strace can indicate whether async I/O is involved, which shifts the focus away from latency and toward throughput.

In the benchmark world, it's important to know that complexity causes many pitfalls. Only one thing helps: repeatedly questioning the plausibility of the measurement results.

The Authors

Thorsten Stärk always wanted to be an inventor, and he is pretty close to this objective today. He works as an SAP solutions engineer for VCE, an EMC company. His tasks include creating reference implementations and appliances, as well as support and performance measurements.

Benjamin Schweizer is the author of iops. He loves troubleshooting and performance analysis and always enjoys immersing himself in the program code of software and operating systems. Professionally, he is a principal IT engineer in Fujitsu's IT outsourcing division.
