Lead Image © arsgera, 123RF.com

Linux Storage Stack

Stacking Up

Article from ADMIN 31/2016
Abstraction layers are the alpha and omega in the design of complex architectures. The Linux Storage Stack is an excellent example of well-coordinated layers. Access to storage media is abstracted through a unified interface, without sacrificing functionality.

In the storage stack context, the end users are typically ordinary userspace applications. The first component with which Linux programs interact when processing data is the virtual filesystem (VFS). Only through the VFS is it possible to invoke the same system calls for different filesystems on different media. Thanks to the VFS, for example, the cp command can copy a file from an ext4 to an ext3 filesystem transparently for the user.
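As a minimal sketch of this uniformity, the following Python snippet copies a file using only the thin wrappers around the open, read, write, and close system calls. Nothing in the code depends on the filesystem type of the source or the destination. The function name copy_file and the 64KB buffer size are illustrative choices, not a description of how cp is implemented.

```python
import os

def copy_file(src, dst, bufsize=64 * 1024):
    """Copy src to dst with generic system calls; return the bytes copied."""
    fd_in = os.open(src, os.O_RDONLY)
    try:
        fd_out = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            copied = 0
            while True:
                chunk = os.read(fd_in, bufsize)
                if not chunk:          # empty result means end of file
                    break
                view = memoryview(chunk)
                while view:            # write() may accept fewer bytes than offered
                    written = os.write(fd_out, view)
                    view = view[written:]
                copied += len(chunk)
            return copied
        finally:
            os.close(fd_out)
    finally:
        os.close(fd_in)
```

The same call sequence works whether the two paths live on ext3, ext4, a network filesystem, or a FUSE mount, because the VFS dispatches each call to the responsible filesystem implementation.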

The variety of filesystems – block-based, cross-network, pseudo-, and even Filesystems in Userspace (FUSE) – demonstrates the numerous possibilities that VFS encapsulation opens up. The aforementioned system calls are unified functions, such as open, read, or write, no matter what filesystem is hidden underneath. The specific filesystem operations are abstracted by the VFS, and caches – including the directory entry (dentry) cache – speed up file access.

The next layer of the storage stack consists of individual filesystem implementations. They provide the VFS with generic methods and translate them into specific calls for accessing the device. The filesystem also performs its primary task – organizing data and metadata for an underlying storage medium. The Linux kernel also speeds up access to these media with a caching mechanism – the Linux page cache.

Block I/O

Flexible block I/O structures (BIOs) are used instead of pages for administration in the kernel. The structures represent block I/O operations or queries that the kernel is currently executing (in-flight BIOs). This applies both to I/O on the page cache and to direct I/O (i.e., access that bypasses the page cache). The advantage of BIOs lies in their ability to handle the multiple segments involved in a single I/O operation.

BIOs consist of a list or a vector of segments that points to different pages in memory. This results in a dynamic allocation of pages to the sectors on the block device via block I/O. BIOs come up again with respect to the block layer. BIOs are grouped into requests before the kernel passes I/O operations to the driver's dispatch queue.
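The relationship between segments, pages, and sectors can be illustrated with a small toy model. This is a simplified sketch, not the kernel's data structure: the class names Segment and Bio, the 4096-byte page size, and the plain integers standing in for pages are assumptions made for illustration. The fields are loosely modeled on the page, offset, and length triple that the kernel keeps for each segment.

```python
from dataclasses import dataclass
from typing import List

SECTOR_SIZE = 512   # block devices are addressed in 512-byte sectors
PAGE_SIZE = 4096    # assumed page size for this sketch

@dataclass
class Segment:
    page: int     # stand-in for a page in memory (just a number here)
    offset: int   # byte offset of the data within that page
    length: int   # number of bytes in this segment

    def __post_init__(self):
        if self.offset < 0 or self.length <= 0:
            raise ValueError("offset must be >= 0 and length > 0")
        if self.offset + self.length > PAGE_SIZE:
            raise ValueError("segment must lie within one page")

@dataclass
class Bio:
    start_sector: int         # first sector on the block device
    segments: List[Segment]   # scattered pages that hold the data

    def size_bytes(self) -> int:
        return sum(seg.length for seg in self.segments)

    def sector_count(self) -> int:
        size = self.size_bytes()
        if size % SECTOR_SIZE:
            raise ValueError("I/O size must be a multiple of the sector size")
        return size // SECTOR_SIZE

    def end_sector(self) -> int:
        """First sector after the range covered by this I/O."""
        return self.start_sector + self.sector_count()
```

For example, Bio(2048, [Segment(7, 0, 4096), Segment(42, 0, 4096), Segment(3, 512, 1024)]) describes one contiguous range of 18 sectors on the device whose data is spread across three unrelated pages in memory.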

Stacked Block Devices

An important component of the Linux Storage Stack is located in front of the block layer, where logical block devices are implemented. They may operate with conventional devices, but they also perform additional functions, such as layering logical devices on top of each other (i.e., stacked devices). The Logical Volume Manager (LVM) and software RAID (mdraid) are probably the best known representatives. For example, they allow you to scale logical volumes beyond the limits of a single physical disk or to store data redundantly. Stacking is often used to combine the two, for example by placing LVM on top of a software RAID. As you can see, you have to go through several steps before getting to the block layer.
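How a logical device can span several physical disks is easiest to see in a linear concatenation. The following sketch maps a sector of the logical volume to a sector on one of the underlying devices. It is a deliberately simplified model: the function name map_linear and the device list are hypothetical, and real LVM volumes are described by device mapper tables and allocated in extents rather than by a flat list like this one.

```python
def map_linear(logical_sector, devices):
    """Map a sector of a concatenated volume to (device, sector).

    devices is a list of (name, size_in_sectors) tuples in the order
    in which they are concatenated.
    """
    if logical_sector < 0:
        raise ValueError("sector must not be negative")
    offset = logical_sector
    for name, size in devices:
        if offset < size:
            return name, offset
        offset -= size   # skip this device and continue on the next one
    raise ValueError("sector lies beyond the end of the volume")
```

With devices = [("sda", 1000), ("sdb", 500)], logical sector 999 still lands on sda, whereas sector 1000 is the first sector of sdb. A stacked setup repeats this kind of translation once per layer.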

Block Layer

The block layer processes the BIOs described above and is responsible for forwarding application I/O requests to the storage devices. It provides the applications with uniform interfaces for data access, whether this involves a disk, an SSD, Fibre Channel (FC) SAN, or other block-based storage device. The block layer also gives all applications a uniform access point to storage devices and their drivers. In this way, it conceals the complexity and diversity of storage devices from applications.

The block layer also ensures an equitable distribution of I/O access, appropriate error handling, statistics, and a scheduling system. The latter is particularly intended to improve performance and to protect users from performance disadvantages caused by poorly implemented applications or drivers.

Depending on the respective device drivers and configuration, the data can take three paths in or around the block layer:

  1. Via one of the three traditional I/O schedulers: NOOP, Deadline, or CFQ. A fourth traditional I/O scheduler, the anticipatory scheduler (AS), was similar to CFQ and was therefore removed in kernel version 2.6.33.
  2. Via the Linux multiqueue block I/O queuing mechanism (blk-mq). Introduced with Linux kernel 3.13, blk-mq is relatively new. The Linux kernel developers mainly developed it for high-performance flash memory, such as PCIe SSDs, to break through the IOPS (I/O operations per second) performance limitations of traditional I/O schedulers.
  3. Diverting around the block layer directly to BIO-based drivers. In the period before blk-mq, manufacturers of PCIe SSDs developed these elaborate drivers, which have to implement a lot of the block layer tasks themselves.
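Which scheduler a device currently uses can be read from the scheduler file in sysfs, which lists the available schedulers and marks the active one with square brackets (e.g., noop deadline [cfq]). The following sketch parses that format. The helper names parse_scheduler and read_scheduler are illustrative, and the exact entries in the file depend on the kernel version and on the driver.

```python
def parse_scheduler(text):
    """Return (available, active) from the content of a scheduler file.

    active is None if no entry is marked with square brackets.
    """
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]   # strip the brackets around the active entry
            active = token
        available.append(token)
    return available, active

def read_scheduler(device):
    """Read and parse the scheduler setting of a block device such as 'sda'."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return parse_scheduler(f.read())
```

On a system with the traditional schedulers, read_scheduler("sda") might return (["noop", "deadline", "cfq"], "cfq"). The device name sda is only an example.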
