Scale-out with PostgreSQL

Load Up

Distributed SQL

YugabyteDB is based both on proven work, as described in a Google paper on Spanner [1], and on unique innovation. In a two-layer architecture, the reused PostgreSQL in the SQL layer builds on the underlying DocDB layer, which is the distributed transactional key-value data storage layer. Applications can connect to any node and see a single logical database from there. Although the Yugabyte structured query language (YSQL, the PostgreSQL-compatible layer) is the main interface, it can be replaced (Figure 2). For example, Yugabyte has a Cassandra-like API that can be used if the user does not need full SQL support and instead wants to use the syntax and special features of Cassandra (e.g., TTL, or time to live).

Figure 2: The exchangeable API layers on top of the distributed storage engine are a special feature of YugabyteDB. © Yugabyte

The PostgreSQL back end, made ready for the cluster with additional code, parses the queries and executes them. The tables are split into shards by a hash or range function and distributed among the database nodes. The default sharding strategy includes a hash on the first column of the key and an automatic split as the table grows.

However, the PostgreSQL syntax has been extended to allow full administrative control. The statement

yugabyte=# create table emp (
  empno int, hiredate timestamp,
  primary key (empno hash)
  ) split into 42 tablets;

creates a table with hash sharding based on the empno column with 42 preset shards (known as tablets). The statement

yugabyte=# create index emp_hire
  on emp ( hiredate asc )
  split at values (
    ('2010-01-01'),
    ('2015-01-01'),
    ('2020-01-01')
  );

creates an index for the hiredate column that is split by range to allow a scan of the index range, for which the year ranges 2010 to 2020 are preset.

To avoid latency between nodes during joins, multiple tablets can be stored in the same shard so that all joins remain local. Some reference tables can be synchronized with multiple regions. With a replication factor (RF)=3 across three data centers or availability zones, the database can be used even if one of the data centers fails. Nodes can also be distributed as part of a multicloud or hybrid cloud setup (Figure 3).

Figure 3: Several options for the distribution of nodes. © Yugabyte

By default, the tables and indexes are distributed over the entire cluster. If latency or data management requires, distribution can be limited to a subset of the cluster, such as a specific region, achieved with the help of PostgreSQL partitioning functions and tablespaces. Tablespaces are used to map tables, indexes, and partitions to the cluster topology. The example in Listing 1 replicates a region (eu-west ) across three availability zones.

Listing 1

Replication

create tablespace "eu-west" with ( replica_placement= $$
  {
  "num_replicas": 3, "placement_blocks": [
    { "cloud": "cloud", "region": "eu-west1", "zone": "eu-west1a", "min_num_replicas": 1 },
    { "cloud": "cloud", "region": "eu-west1", "zone": "eu-west1n", "min_num_replicas": 1 },
    { "cloud": "cloud", "region": "eu-west1", "zone": "eu-west1c", "min_num_replicas": 1 }
  ]
  } $$);
alter table emp set tablespace = "eu-west";

Afterward, the tablets are realigned with this topology (in the last line). This online process moves tablets as they are read from and written to, and it all works thanks to replication.

Architecture

Given the technical challenges, not many distributed SQL databases provide full SQL functionality without limiting transactions to a single server.

Google Spanner started with limited SQL but later added protocol and syntax compatibility with PostgreSQL. It is still restricted to the Google Cloud platform. CockroachDB does the same thing but opens up to other platforms and provides some core functionality as open source. Its PostgreSQL compatibility still requires code changes when porting PostgreSQL. TiDB is compatible with MySQL.

In this article on distributed SQL with YugabyteDB, I focus on the YSQL API, even though many Cassandra converts use the Cassandra-like Yugabyte cloud query language (YCQL).

PostgreSQL Compatibility

The decision in favor of compatibility with PostgreSQL was driven by its popularity for OLTP applications and its consistency with the SQL standard. Having the same protocol and syntax and providing the same functionality is useful – not only for the portability of existing applications, but also for familiarity. YugabyteDB is a new database, and in the event of a problem, searching for an answer online (e.g., Stack Overflow) would not yield many hits.

However, when it comes to SQL issues, the knowledge base in the documentation, forums, and blogs regarding PostgreSQL also applies to YugabyteDB. Compatibility in this case means the ability to run the same applications and tools used with PostgreSQL. YugabyteDB provides all isolation levels with the same semantics as PostgreSQL to ensure the same behavior in production. Other distributed databases only implement the serializable isolation level, which requires the application to implement repetition logic.

The YSQL code layer is based on a fork of PostgreSQL that comes with many SQL functions and includes not only all current PostgreSQL features, but also the use of numerous PostgreSQL extensions. The places in the code that are specific to YugabyteDB are clearly marked. Look for IsYBRelation() in the code to get a good idea. This approach enables adaptation to the new features of future PostgreSQL versions that are released every year.

« Previous 1 2 3 Next »