Photo by Brenton Pearce on Unsplash

Realistic Ceph Performance Tuning Measures

Turbocharged

Article from ADMIN 93/2026
Administrators regularly criticize Ceph performance tuning because some elements can be tweaked effectively, whereas others are difficult or impossible to fine-tune. We look at how to measure performance, compare actual requirements with what is technically possible, and determine realistic objectives.

Modern, highly scalable storage is a must-have for today's data centers. Scalable platforms such as OpenStack and Kubernetes have become the norm, so administrators regularly have to deal with the question of how to design their storage to keep pace with compute power growth.

Traditional network-attached storage (NAS) and storage area network (SAN) appliances fall short in this regard: Once they fill up, you can only add more storage by purchasing another appliance, which is costly, creates another point of administration, and does not make your daily work any easier.

The situation is different with Ceph, because the project explicitly promises that the platform will scale more or less indefinitely. According to the official documentation, the limit is several hundred million storage drives; anyone who has to manage that many is dealing with other problems, anyway.

Even with Ceph, not all that glitters is gold. Admins have always complained about its performance. The good news is that virtually all of Ceph's potential performance tweaks can be leveraged – to varying degrees, admittedly. In this article, I use a small Ceph installation as a practical example to explain what to look for in terms of performance and what challenges you can expect.

Robust but Slow

In addition to its high scalability, Ceph and its core, the RADOS object storage system, have other compelling arguments in their favor. The technology is available under a free license and is completely open source. Ceph has also earned a reputation for being virtually indestructible. Even stopping an object storage daemon (OSD; i.e., a daemon that makes a drive available for use in Ceph) with SIGSTOP, deleting its data, and resuming the process with SIGCONT does not lead to data corruption in the cluster. Instead, the OSD in question simply outputs a warning before shutting itself


...
