Lead Image © J.R. Bale, 123RF.com

Managing Linux Memory

Memory Hogs

Article from ADMIN 21/2014
Even Linux systems with large amounts of main memory are not protected against bottlenecks and potentially drastic performance degradation because of memory shortage. In this article, we investigate the complex causes and test potential solutions.

Modern software such as large databases runs on Linux machines that often provide hundreds of gigabytes of RAM. Systems that need to run the SAP database (HANA) in a production environment, for example, can have up to 4TB of main memory [1]. Given these sizes, you might expect that memory-related bottlenecks no longer play a role, but the experience of users and manufacturers with such software shows that the problem is not yet completely solved and still needs attention.

Even well-tuned applications can lose performance when too little memory is available under certain conditions. The standard remedy in such situations – more RAM – does not always solve the problem. In this article, we first describe the problem in more detail, analyze the background, and then test solutions.

Memory and Disk Hogs

Many critical computer systems, such as SAP application servers or databases, primarily require CPU and main memory. Disk access is rare and optimized. Parallel copying of large files should thus have little effect on such applications because they require different resources.

Figure 1 shows, however, that this assumption is not true. The diagram demonstrates how the (synthetic) throughput of the SAP application server changes if disk-based operations occur in parallel. For the 100 percent reference value, we ran a test cycle without parallel disk access. In the subsequent test runs, the dd command writes a file of the specified size to the hard disk.
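The article does not give the exact dd invocation, so the following Python sketch only approximates that kind of load generator. The function name write_test_file and its parameters are our own assumptions. It writes a file of a given size in fixed-size blocks of zeros, much as dd would with a 1MB block size and a matching count.

```python
import os
import tempfile


def write_test_file(path, size_mb, block_size=1024 * 1024):
    """Write size_mb blocks of zeros to path and return the file size in bytes.

    Hypothetical helper: roughly comparable to dd reading from /dev/zero
    with a 1 MiB block size. The data passes through the page cache on
    its way to disk because no direct I/O is requested.
    """
    block = bytes(block_size)  # one block of zero bytes
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(block)
    return os.path.getsize(path)


if __name__ == "__main__":
    # Small demo size; the test runs in the article used far larger files.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "testfile.bin")
        print(write_test_file(target, size_mb=8), "bytes written")
```

To reproduce an effect like the one in Figure 1, the file would have to be large relative to the free memory of the machine; the 8MB used in the demo is only a placeholder.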

Figure 1: Performance degradation as a function of disk I/O on an SAP application server.

On a system in a stable state, throughput initially is not affected by file operations, but beyond a certain file size (e.g., 16,384MB), performance collapses. As Figure 1 shows, system throughput drops by nearly 40 percent as the file size increases. Although the figures are likely to differ in normal operation, a significant problem still exists.

Such behavior is often observed in daily operation when a backup has to move data in parallel, for example overnight, or more generally when large files are copied. A closer examination of these situations shows that paging increases at the same time (Figure 1). Thus, it seems that, under certain circumstances, frequent disk access by processes that need little CPU and memory can degrade the performance of applications that only rarely access the disks.
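One way to observe such an increase in paging is to compare the swap counters that Linux exposes in /proc/vmstat before and after a copy operation. The sketch below is not the measurement method used for Figure 1; the helper names parse_vmstat and swap_activity are our own, and the 10-second interval is arbitrary. The pswpin and pswpout counters report the number of pages swapped in and out since boot.

```python
import os
import time


def parse_vmstat(text):
    """Turn 'name value' lines from /proc/vmstat into a dict of integers."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Return how many pages were swapped in and out between two snapshots."""
    return {
        key: after.get(key, 0) - before.get(key, 0)
        for key in ("pswpin", "pswpout")
    }


if __name__ == "__main__" and os.path.exists("/proc/vmstat"):
    with open("/proc/vmstat") as f:
        first = parse_vmstat(f.read())
    time.sleep(10)  # arbitrary interval; run the file copy during this time
    with open("/proc/vmstat") as f:
        second = parse_vmstat(f.read())
    print(swap_activity(first, second))
```

Values that stay at zero while a large file is written suggest that the page cache still fits into free memory; rising values indicate that application pages are being swapped out.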

Memory Anatomy

The degradation of throughput with increasing file size is best understood by looking at how the Linux kernel manages main memory. Linux, like all modern systems, distinguishes between virtual memory, which the operating system and applications see, and physical memory, which is provided by the hardware of the machine, the virtualizer, or both [2] [3]. Both forms of memory are organized into units of equal size. In the case of virtual memory, these units are known as pages, whereas in physical memory they are called page frames. On modern systems, both are still often 4KB in size.
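The page size of a running system can be queried directly. The following sketch uses mmap.PAGESIZE from the Python standard library; the helper pages_needed is a name of our own choosing and simply rounds a byte count up to whole pages.

```python
import mmap

# Page size reported by the operating system, often 4096 bytes.
PAGE_SIZE = mmap.PAGESIZE


def pages_needed(num_bytes, page_size=PAGE_SIZE):
    """Return the number of pages required to hold num_bytes (rounded up)."""
    return -(-num_bytes // page_size)  # ceiling division with integers


if __name__ == "__main__":
    print("page size:", PAGE_SIZE, "bytes")
    # Assuming 4 KiB pages, 4 TiB of memory is split into 2**30 frames.
    print(pages_needed(4 * 2**40, page_size=4096))
```

With 4KB units, even a single byte occupies a full page, and a machine with several terabytes of RAM has to manage on the order of a billion page frames.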

The hardware architecture also determines the maximum size of virtual memory: In a 64-bit architecture, the virtual address space is a maximum of 2^64 bytes in size – even if current implementations on Intel and AMD only support 2^48 bytes [4].
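To put these two limits into perspective, the short sketch below converts powers of two into binary units. The function human_size is our own illustrative helper.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def human_size(exponent):
    """Express 2**exponent bytes in the largest fitting binary unit."""
    index = min(exponent // 10, len(UNITS) - 1)
    value = 2 ** (exponent - 10 * index)
    return f"{value} {UNITS[index]}"


if __name__ == "__main__":
    print("2^48 bytes =", human_size(48))  # 256 TiB
    print("2^64 bytes =", human_size(64))  # 16 EiB
    print("ratio:", 2**64 // 2**48)        # 65536
```

The 2^48 bytes supported by current implementations thus correspond to 256TiB, which is 65,536 times less than the architectural maximum but still far more than the 4TB systems mentioned earlier.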

Virtual and Physical

If software – including the Linux kernel itself – wants to access memory contents, the virtual addresses must be mapped to addresses in physical memory. This mapping is implemented by process-specific page tables and ultimately amounts to replacing the address of the virtual page with the address of the page frame in which the content is physically stored (Figure 2).
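The principle of this mapping can be sketched in a few lines. The example below is a deliberately simplified model and not how the kernel or the hardware implements it: It assumes 4KB pages and a flat, single-level page table represented as a Python dictionary, whereas real systems use multi-level tables walked by the memory management unit. The names translate and page_table are our own.

```python
PAGE_SHIFT = 12               # assumed 4 KiB pages: 2**12 bytes
PAGE_SIZE = 1 << PAGE_SHIFT


def translate(page_table, vaddr):
    """Map a virtual address to a physical address via a toy page table.

    page_table maps virtual page numbers to page frame numbers.
    """
    vpn = vaddr >> PAGE_SHIFT           # virtual page number
    offset = vaddr & (PAGE_SIZE - 1)    # position within the page
    if vpn not in page_table:
        # In a real system, this would trigger a page fault.
        raise LookupError(f"no mapping for virtual page {vpn}")
    frame = page_table[vpn]
    return (frame << PAGE_SHIFT) | offset


if __name__ == "__main__":
    table = {0: 5, 1: 2}                # page 0 -> frame 5, page 1 -> frame 2
    print(hex(translate(table, 0x1234)))  # page 1, offset 0x234 -> 0x2234
```

The offset within the page is carried over unchanged; only the page number is replaced by the frame number.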

Figure 2: Memory management components on Linux.

While the virtual address space for applications is divided into sections for code, static data, dynamic data (heap), and the stack, the operating system claims larger page frame areas for its internal caches [1]. One of the key caches is the page cache, which at its simplest is a storage area in which certain pages are held temporarily, including the pages involved in file access. If these pages are in main memory, the system can avoid frequent access to the slow disks. For example, when a read() call requests data from a disk area, the Linux kernel first checks whether the requested data exists in the page cache. If so, it reads the data from there.
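The lookup logic described here can be illustrated with a toy model. The class ToyPageCache below is purely hypothetical and much simpler than the kernel's data structures: It keeps pages in a dictionary keyed by file and page index, never evicts anything, and uses a second dictionary to stand in for the slow disk.

```python
class ToyPageCache:
    """Toy model of a read path that consults a cache before the disk."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the slow disk
        self.cache = {}                     # (file_id, page_index) -> data
        self.disk_reads = 0                 # counts accesses to the "disk"

    def read_page(self, file_id, page_index):
        key = (file_id, page_index)
        if key in self.cache:
            # Cache hit: serve the data from memory.
            return self.cache[key]
        # Cache miss: fetch from the backing store and remember the page.
        self.disk_reads += 1
        data = self.backing_store[key]
        self.cache[key] = data
        return data


if __name__ == "__main__":
    disk = {("log", 0): b"first page", ("log", 1): b"second page"}
    cache = ToyPageCache(disk)
    cache.read_page("log", 0)
    cache.read_page("log", 0)   # second access is served from the cache
    print(cache.disk_reads)     # 1
```

Because this toy cache grows without limit, it also hints at why a large file written or read in one go can claim a great deal of main memory.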
