Scalable mail storage with Dovecot and Amazon S3

Storage Space

Ceph Basics

These concerns, however, do not mean that you need to forgo the convenience of S3 storage in Dovecot entirely. Because the S3 protocol is publicly documented, several FLOSS projects provide S3-compatible storage, including the rising star of the storage world, Ceph.

Ceph has received much publicity in recent months, especially because of the sale of Inktank, the company behind it, to Red Hat. Ceph is thus a familiar concept for most admins; it can be seen as an object store with various front ends. The ability to provide multiple front ends sets it apart from other object stores such as OpenStack Swift. Ceph was designed as a universal store for almost everything that happens in a modern data center.

A Ceph cluster ideally consists of at least three machines. Various Ceph components then run on these machines, including a monitoring server (MON) on each host and a storage daemon (object storage daemon, or OSD) for each existing hard drive.

The monitoring servers are the guards within the storage architecture: They maintain the quorum using the Paxos algorithm to avoid split-brain situations. Generally, a cluster partition is only considered quorate if it contains a strict majority of the MONs, that is, more than 50 percent of them. Consequently, in a three-node cluster, a cluster partition is quorate if it sees two MONs; Ceph would automatically switch off a partition that only sees one MON.
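The majority rule can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not Ceph code; the function name `is_quorate` and its parameters are hypothetical.

```python
def is_quorate(visible_mons: int, total_mons: int) -> bool:
    """Return True if a partition that sees `visible_mons` out of
    `total_mons` monitors holds a strict majority."""
    if total_mons <= 0 or not 0 <= visible_mons <= total_mons:
        raise ValueError("monitor counts are inconsistent")
    # More than half of all MONs must be visible.
    return visible_mons > total_mons // 2


if __name__ == "__main__":
    # Three-node cluster from the text: two MONs suffice, one does not.
    for visible in (1, 2, 3):
        print(visible, "of 3 MONs visible ->", is_quorate(visible, 3))
```

The same rule explains why an even number of MONs adds little: with four MONs, a partition still needs three of them, so a clean two-two split leaves neither half quorate.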


Furthermore, the MONs act as the directory for the Ceph cluster: Clients talk directly to the hard drives in the cluster (i.e., the OSDs), but to do so, they need to know how to reach them. The MONs maintain dynamic lists of the existing OSDs and the existing MONs (OSD and MON maps) and serve up both maps when clients ask for the information. On the basis of the CRUSH algorithm, clients can then calculate the correct position of binary objects themselves. In this sense, Ceph does not have a central directory in which the target disks are recorded for each individual binary object.
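The idea that every client computes an object's location from the map alone, without asking a central directory, can be sketched as follows. Note that this example deliberately uses rendezvous hashing as a much simpler stand-in; it is not the CRUSH algorithm, which additionally takes weights and the physical hierarchy of the cluster into account. The function name `place_object` and the default of three replicas are assumptions made for illustration.

```python
import hashlib


def place_object(object_name: str, osd_ids: list[int], replicas: int = 3) -> list[int]:
    """Pick `replicas` distinct OSDs for an object, using only the
    object name and the list of OSDs (a stand-in for the OSD map)."""

    def score(osd_id: int) -> int:
        # Deterministic pseudo-random score per (object, OSD) pair.
        digest = hashlib.sha256(f"{object_name}:{osd_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring OSDs win; every client ranks them identically.
    return sorted(osd_ids, key=score, reverse=True)[:replicas]


if __name__ == "__main__":
    osd_map = [0, 1, 2, 3, 4, 5]  # hypothetical OSD IDs
    print(place_object("mail-0001", osd_map))
```

Two clients holding the same OSD list arrive at the same placement independently, which is the property the text describes: no per-object lookup table is needed.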

Parallelism is another major advantage of Ceph; it is inherent to almost all Ceph services. A client that wants to store a 16MB file in Ceph usually divides it into four blocks of 4MB. It then uploads all four blocks to four OSDs in the cluster at the same time, leveraging the combined write speed of four hard drives. The more spindles there are in a cluster, the more requests the Ceph cluster can handle simultaneously.
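The splitting and parallel upload described above can be sketched in Python. This is a toy model under stated assumptions: each OSD is represented by an in-memory dictionary, blocks are assigned round-robin rather than by CRUSH, and the names `split_into_blocks` and `store_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4 * 1024 * 1024  # 4MB, as in the example from the text


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut `data` into consecutive blocks of at most `block_size` bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store_parallel(name: str, data: bytes, osds: list[dict],
                   block_size: int = BLOCK_SIZE) -> list[str]:
    """Write all blocks concurrently, one worker per simulated OSD.
    Returns the block keys in their original order."""
    blocks = split_into_blocks(data, block_size)

    def write_block(item: tuple[int, bytes]) -> str:
        index, block = item
        key = f"{name}.{index}"
        osds[index % len(osds)][key] = block  # round-robin placement (assumption)
        return key

    with ThreadPoolExecutor(max_workers=len(osds)) as pool:
        return list(pool.map(write_block, enumerate(blocks)))


if __name__ == "__main__":
    fake_osds = [{} for _ in range(4)]  # four simulated disks
    keys = store_parallel("message", bytes(16 * 1024 * 1024), fake_osds)
    print(keys, [len(osd) for osd in fake_osds])
```

With a 16MB payload and four simulated OSDs, each dictionary ends up holding exactly one 4MB block, mirroring the four simultaneous writes in the text.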

This is an important prerequisite for use in the S3 example: Even mail systems that are exposed to increased loads can easily store many email messages in Ceph in the background at the same time. Ceph usually only performs badly compared with conventional storage solutions when it comes to sequential write latencies; however, this is irrelevant for the Dovecot S3 example.

Ceph Front Ends

The most attractive object store is useless if clients cannot communicate with it directly. Ceph provides multiple options for clients to contact it. The RADOS block device emulates a normal hard drive on Linux. What appears to be a locally installed disk on the client computer is, in fact, a virtual block device. Writes to this block device go directly to Ceph in the background.

Ceph FS is a POSIX-compatible filesystem; however, it is Inktank's eternal problem child, which has stubbornly remained unfinished for years. The Ceph Object Gateway, by contrast, is really interesting for the S3 example shown in Figures 3, 4, and 5. The construct, previously called RADOS Gateway, is based on Librados, which allows direct and native access to Ceph objects. On top of this, the gateway exposes RESTful APIs that follow the syntax of either Amazon S3 or OpenStack Swift.

Figure 3: An existing Ceph cluster can be extended quickly with a Ceph Object Gateway using an entry like this.
Figure 4: The Ceph Object Gateway needs a FastCGI-enabled web server to work. Apache with mod_fastcgi is the typical combination.
Figure 5: The proof: The Ceph Object Gateway responds when its URL is called, indicating that no buckets are stored for the anonymous user.

The Ceph Object Gateway might not implement the S3 specification completely, but it does provide the main features. This completes the solution: A cluster with at least three nodes operates on local Ceph disks. Additionally, a server runs the Ceph Object Gateway and allows various instances of Dovecot with the S3 plugin to store email.

Such a solution scales on all levels: If the Ceph cluster needs more space, you can just add more computers. If the strain on the Dovecot server becomes too high, you can likewise add more computers. As long as there is enough physical space, this principle can be expanded with practically no limits.
