Photo by Brenton Pearce on Unsplash

Realistic Ceph Performance Tuning Measures

Turbocharged

Article from ADMIN 93/2026
Administrators regularly criticize Ceph performance tuning because some elements can be tweaked effectively, whereas others are difficult or impossible to fine-tune. We look at how to measure performance, compare actual requirements with what is technically possible, and determine realistic objectives.

Modern, highly scalable storage is a must-have for today's data centers. Scalable platforms such as OpenStack and Kubernetes have become the norm, so administrators regularly have to deal with the question of how to design their storage to keep pace with compute power growth.

Traditional network-attached storage (NAS) and storage area network (SAN) appliances are difficult in this regard: Once they fill up, you can only add more storage by purchasing another appliance, which can be costly, creates another point of administration, and does not make your daily work any easier.

The situation is different with Ceph, because its developers explicitly promise that the platform will scale more or less indefinitely. According to the official documentation, the limit is several hundred million storage drives; anyone who has to manage that many is dealing with other problems, anyway.

Even with Ceph, not all that glitters is gold. Admins have always complained about its performance. The good news is that virtually all of Ceph's potential performance tweaks can be leveraged – to varying degrees, admittedly. In this article, I use a small Ceph installation as a practical example to explain what to look for in terms of performance and what challenges you can expect.

Robust but Slow

In addition to its high scalability, Ceph and its core, the RADOS object storage system, have other compelling arguments in their favor. The technology is available under a free license and is completely open source. Ceph has also earned a reputation for being virtually indestructible. Even stopping an object storage daemon (OSD; i.e., a daemon that makes a drive available for use in Ceph) with SIGSTOP, deleting its data, and continuing the process with SIGCONT does not lead to data corruption in the cluster. Instead, the OSD in question simply outputs a warning before shutting itself


...


Related content

  • Fixing Ceph performance problems
    Ceph is powerful and efficient, but wrong settings or faulty hardware can cause the decentralized object store to stumble.
  • Ceph Dashboard at a Glance
    To hardcore command-line enthusiasts, the Ceph Dashboard might sound like a frowned-upon point-and-click gadget, but the advantages simply cannot be ignored; so much so, that Ceph Dashboard has established a firm following in the Ceph universe.
  • What's new in Ceph
    Ceph and its core component RADOS have recently undergone a number of technical and organizational changes. We take a closer look at the benefits that the move to containers, the new setup, and other feature improvements offer.
  • Optimally combine Kubernetes and Ceph with Rook
    Ceph distributed storage and Kubernetes container orchestration come together with Rook.
  • Getting Ready for the New Ceph Object Store
    The Ceph object store remains a project in transition: The developers announced a new GUI, a new storage back end, and CephFS stability in the just released Ceph v10.2.x, Jewel.
