Lead Image © Maikloff, Fotolia.com

Lead Image © Maikloff, Fotolia.com

A storage engine for every use case

Data Machine

Article from ADMIN 50/2019
You can choose from among a number of open source database storage engines for both general and specialized data.

MySQL knows very little about how to manage a table. Instead, it delegates such operations to a specific plugin type called the storage engine, which is the implementation of some important functions called by MySQL, including writing, deleting, and updating a row; reading rows; maintaining indexes; accessing indexes to find data more quickly; handling transactions (although many do not do this); and much more.

Today, the concept of a storage engine is implemented by several databases, such as MongoDB and Tarantool, the NoSQL database management system (DBMS) and Lua application server. However, MySQL probably invented this concept, because older competitors (e.g., Oracle, PostgreSQL) never implemented such a thing, knowing only one way to read or write a row.

One of the most famous contributors to MySQL and its fork MariaDB is Facebook. Last year, the social network spread the news of an important migration: Its MySQL instances, which relied on the default storage engine InnoDB, were migrated to their in-house, home-made storage engine, MyRocks, for their RocksDB high-performance database.

Not every company runs databases at the same scale as Facebook, so this piece of news gathered interest from database professionals, system administrators, and developers. Some people realized for the first time that MySQL supports storage engines, and those who already knew realized that MySQL storage engines are something to take extremely seriously. After all, it is difficult to doubt the quality of an important part of the Facebook infrastructure.

Why Storage Engines?

Why would MySQL implement storage engines? Maintaining multiple storage engines is expensive for a development team, even if this simply means fixing the most important bugs. Storage engines need to cooperate when different tables are used in a JOIN or when writes to different tables are performed in the same transaction, bringing a whole class of problems for developers to solve.

The MySQL architecture is complicated by the impossibility of making too many assumptions about what a storage engine can do. For this reason, MySQL has both transaction logs and a binary log. Although some storage engines write logs to implement transactions, or at least some form of crash-safeness, MySQL needs to maintain a centralized binary log to implement some features like replication.

Nevertheless, storage engine architecture is appreciated by many users because not all workloads are equal. In most cases, the default storage engine InnoDB is enough, although most users don't even realize they're using it. Sometimes, though, you need to optimize some aspects of your workload. Maybe you want to reduce the hardware requirements and grant your SSDs a longer life. Maybe you want extra reliability and consistency guarantees, or in another case, you might want to sacrifice reliability to gain more speed. The idea behind storage engines is that you should be able to make such choices without migrating to a different technology.

MySQL comes with some alternative storage engines that can be used in place of InnoDB. Its forks, MariaDB and Percona Server, come with some other storage engines maintained by their teams or by third parties.

Working with Storage Engines

MySQL expects storage engines to implement functions, accept certain parameters, and return a result that it is able to understand; otherwise, they have a high degree of freedom. For example, when MySQL asks the engine to write a row, it could maintain a cache and write data in a transactional way, write the table in-memory instead of to a disk, access a remote table located on another server, run a Perl script, or even do absolutely nothing. Although this might seem to be an unusual way of using database software, it actually opens a wide range of interesting possibilities. I'll mention more about this later when I talk about the least standard storage engines.

The list of available storage engines depends on the MySQL variant, its version, and the way it was installed. Typically, MySQL packages only contain some engines, and others can be installed with dedicated packages. The list can be retrieved with the SHOW ENGINES command. Listing 1 shows the results of this command in a MariaDB 10.3 container from the official MariaDB image.

Listing 1

MariaDB 10.3 Storage Engines

| Engine             | Support | Comment                                                                          | Transactions | XA   | Savepoints |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                            | NO           | NO   | NO         |
| CSV                | YES     | Stores tables as CSV files                                                       | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                        | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint          | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage                                           | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                               | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                   | YES          | NO   | YES        |

When you create a table, you can specify a storage engine as follows:

CREATE TABLE employee (
  email VARCHAR(100) NOT NULL,
  full_name VARCHAR(200) NOT NULL,
) ENGINE = RocksDB;

To change the storage engine of an existing table, enter:


Creating a table without specifying a storage engine uses the default, which is normally InnoDB:

MySQL [test]> SHOW GLOBAL VARIABLES LIKE 'default\_%storage_engine';
| Variable_name              | Value  |
| default_storage_engine     | InnoDB |
| default_tmp_storage_engine |        |

To change the default, you can modify the default_storage_engine variable. Temporary tables also can use a different default engine with the default_tmp_storage_engine variable.

To use a different storage engine, you usually need to install a package, after which you will see it in the plugin directory. To find the plugin directory, enter:

MySQL [test]> SHOW GLOBAL VARIABLES LIKE 'plugin_dir';
| Variable_name | Value                  |
| plugin_dir    | /usr/lib/mysql/plugin/ |

Storage engines are normally defined in files that start with the ha_ prefix and end with the .so extension (.dll on Windows). To install and uninstall the Sphinx engine, for example, enter:

INSTALL PLUGIN sphinx SONAME 'ha_sphinx.so';

Before installing a storage engine, check its documentation, because it might require additional plugins.

Traditional MySQL Storage Engines

InnoDB is the default storage engine since MySQL 5.5, released in 2010. However, the first InnoDB version was released in 2001, and InnoDB was also the first part of MySQL acquired by Oracle, who made the acquisition in 2005.

From the beginning, InnoDB was important because it implemented foreign keys and implemented transactions more completely. Later, other MySQL engines implemented transactions, but none of them could beat InnoDB performance for the general case.

When InnoDB is configured for maximum reliability, it will not lose any data if a server or system crashes (unless, of course, the storage device is damaged). Its performance is very good for the most common use cases. InnoDB can (1) handle a high load of short transactions, and occasionally long ones, well, although it will behave a bit worse with a high load of long transactions; (2) support full-text searches and GIS data for geographical/geometrical queries; and (3) compress or encrypt tables.

In InnoDB, all rows are stored together with the primary key. Secondary indexes are ordered data structures that contain searchable columns and the corresponding primary key values. This structure has some important performance implications: Searches by primary key are very fast, whereas searches by secondary keys tend to be slower, because they usually imply a search on an index followed by a search on the primary key. Rows are physically sorted by the primary key. Insertions are only fast if the values are inserted in order (lower values first, higher values later). This is not a problem if the primary key is an AUTO_INCREMENT ID, but you should avoid, for example, the use of hashes or UUIDs as primary keys. Because primary key values are contained in secondary indexes, a big primary key implies big indexes. Again, an AUTO_INCREMENT ID is a good choice.

To make sure that InnoDB is fast, you need to make sure it has enough memory. Its buffer pool should contain all frequently accessed data to avoid frequent disk reads. Another aspect to care about is the size of the transaction logs. The official recommendation is to make sure they contain at least one hour of changes, although making it even a bit bigger is usually not a bad idea. If it is too small, InnoDB will need to flush changes to the table files often, which will slow it down.

In Percona Server, InnoDB is called XtraDB , which contains optimizations and bug fixes from Percona.

MyISAM is the MySQL historical default storage engine. It replaced the simpler ISAM in version 3.23, the version that made MySQL famous, and remained the default for some years. Before replacing MyISAM, InnoDB had to prove its stability and reliability. Even then, it took some time to develop full-text and GIS features that were a good alternative to MyISAM, which is why, even today, it is not too surprising to find old applications and old MySQL installations that still make use of many MyISAM tables.

By today's standards, MyISAM is quite primitive. It does not support transactions or foreign keys, and it is not even crash-safe. If MySQL crashes, some MyISAM tables will usually be corrupted. Although you can repair them with the REPAIR command or with the myisamchk utility, changes that were not flushed to disk before the crash are simply lost.

Despite all these problems, MyISAM still has some use cases. Indexes don't contain the primary key, so they are small compared with InnoDB, and MyISAM compresses tables at a very good compression ratio. Compressed MyISAM tables are read-only, but sometimes this is not a problem. The lack of crash safety should also not be a problem if the table is going to be read-only. Although some other storage engines reach very good compression ratios, they don't necessarily support secondary indexes, full text, or GIS.

The MERGE storage engine (aka Mrg_MyISAM ) is described for historic reasons. Its purpose was to merge multiple MyISAM tables into a logical MERGE table that could be accessed with regular SQL queries.

The underlying MyISAM tables were also still writeable individually, which meant it worked around the system's maximum file size limit (although this should not be a problem today), and it distributed the workload between multiple MyISAM tables, with a MERGE table that acted as a larger data container. Nowadays you can achieve the same purpose with views and use any storage engine for the underlying tables, but in many cases, table partitioning is preferable.

The MEMORY storage engine creates in-memory tables. If MySQL is restarted, all data is lost. Transactions are not supported; it only supports hash indexes, which are only used to search an exact value (e.g., WHERE page = 'home'). Because of these characteristics, MEMORY is suitable as a way to cache frequently accessed data, but not much more.

In MySQL versions older than 8.0, as well as in MariaDB, internal temporary tables created to perform a certain operation are built with the MEMORY storage engine when possible – for example, if an ORDER BY cannot be performed by index. However, if the data to sort is too big or if it contains types not supported by MEMORY, a table must be written to disk. For this reason, MariaDB and Percona Server extended MEMORY to support TEXT and BLOB types.

BLACKHOLE is the most simple storage engine. When asked to write data, it does nothing and reports success. When it receives a SELECT, it returns an empty set. Basically, it is the SQL version of the /dev/null file.

Which problem does it solve, then? If you have a master and a slave and an InnoDB table, you can turn it into a BLACKHOLE table only on the slave. When you run INSERT statements, the master will write those rows, but the slave will not. The slave's users will have no way to access the data. Similarly, you could use BLACKHOLE on the master and not on the slave.

Although you might think this engine cannot have bugs because it does nothing (as stated during a MySQL conference by a member of the team), if you visit the MySQL bug page [1] and search for the string "blackhole," a few bugs do show up.

CSV , as the name says, allows you to use a CSV file as a table, which can be convenient when importing or exporting data and applying some transformation at the same time. However, you should note that the CSV engine is pretty limited: It does not support indexes or cache, so it is extremely slow with big datasets; NULL is not supported; and it does not allow transactions. In MariaDB, you can use the CONNECT engine (described below) instead.

FEDERATED was developed to allow the use of a table from a remote MySQL server as if it were on the local server. That is, if you connect to a MySQL instance and query a FEDERATED table, MySQL will forward the query to another server that contains the real data. The mechanism, transparent to the application, allows you to read and write data.

FederatedX is an evolution of FEDERATED that comes with MariaDB. Its main improvement is support for transactions.

ARCHIVE is a storage engine for compressed tables. Its compression rate is very good, but it comes with many limitations. For example, neither indexes nor transactions are supported. Also, this storage engine is append-only, with no way to delete or update rows.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy ADMIN Magazine

Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus
Subscribe to our ADMIN Newsletters
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs

Support Our Work

ADMIN content is made possible with support from readers like you. Please consider contributing when you've found an article to be beneficial.

Learn More”>


		<div class=