Sharding and scale-out for databases

Shards

What About Competition?

As I mentioned earlier, ShardingSphere is a comprehensive solution for dragging databases into the present day and adding cloud-ready capabilities and scalability. How does the tool fare compared with its competitors? The question is not easy to answer, because the market segment that ShardingSphere addresses now contains a large number of complex solutions. Their underpinnings, however, differ from those of ShardingSphere, in some cases considerably.

The closest thing to ShardingSphere is Vitess [3], which implements its own MySQL-specific sharding with a similar component structure. Like ShardingSphere, Vitess does not implement its own storage; it accesses MySQL instances in the back end and distributes its own logical database namespace across them. Unlike ShardingSphere, however, Vitess specializes in MySQL and completely lacks support for PostgreSQL.
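To make the idea of spreading one logical namespace across several ordinary database instances more concrete, the following minimal Python sketch routes rows to back ends with a stable hash-modulo rule. It is an illustration only: the back-end names and the functions shard_for and route are hypothetical, and neither ShardingSphere nor Vitess is claimed to work this way internally.

```python
import zlib

# Hypothetical back-end names; a real deployment would hold connection details here.
BACKENDS = ["mysql-0", "mysql-1", "mysql-2"]


def shard_for(key: str, backends=BACKENDS) -> str:
    """Pick a back end for a sharding key with a stable hash-modulo rule."""
    # zlib.crc32 is deterministic across processes, unlike the built-in hash().
    index = zlib.crc32(key.encode("utf-8")) % len(backends)
    return backends[index]


def route(rows, backends=BACKENDS):
    """Group (key, payload) rows by the back end that should store them."""
    placement = {name: [] for name in backends}
    for key, payload in rows:
        placement[shard_for(key, backends)].append((key, payload))
    return placement
```

The application only ever sees the logical table; the routing layer decides which physical instance holds a given row. A plain modulo rule like this one forces most keys to move when the number of back ends changes, which is one reason production systems tend to use more elaborate schemes.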

Other solutions such as YugabyteDB [4] do offer this support. YugabyteDB operates as a classic key-value store under the hood but exposes its structures to the outside world with a PostgreSQL compatibility layer created specifically for this purpose. In the worst case, this setup will cause an application to fail if it tries to use a PostgreSQL feature that the YugabyteDB compatibility layer cannot handle. ShardingSphere takes a smarter approach because it uses actual PostgreSQL or MySQL databases in the background. Anyone looking for an add-on solution for distributed databases will want to include ShardingSphere in their evaluation.

However, ShardingSphere does not offer built-in high availability out of the box. The competitors are far more advanced in this respect: Vitess, for example, can replicate the individual shards of a node to other nodes and seamlessly swap them in during operation. The same is true for the key-value database at the heart of YugabyteDB.
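The principle behind this kind of shard-level high availability can be sketched in a few lines of Python. The Shard class and its fail_over method are hypothetical names chosen for illustration; the sketch only shows the bookkeeping of promoting a replica and says nothing about how Vitess or YugabyteDB detect failures or keep replicas in sync.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shard:
    """One shard with a primary node and zero or more replica nodes."""
    name: str
    primary: str
    replicas: List[str] = field(default_factory=list)

    def fail_over(self) -> str:
        """Promote the first replica after the primary has been lost."""
        if not self.replicas:
            raise RuntimeError(f"shard {self.name} has no replica to promote")
        # The promoted node leaves the replica list and becomes the primary.
        self.primary = self.replicas.pop(0)
        return self.primary
```

A shard without any replica cannot fail over at all, which is the situation you are in if the routing layer itself does not take care of replication and you have not set it up in the database back ends.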

Future Outlook

The ShardingSphere developers consider their work far from complete and are already working on a third flavor of their product: ShardingSphere-Sidecar (Figure 5). Anyone who works in the container and Kubernetes environment will already have some idea of where this product is headed. Sidecar is said to offer ShardingSphere capabilities while coming across as a cloud-native service and integrating seamlessly with container fleets.

Figure 5: The JDBC driver and Proxy will soon be joined by ShardingSphere-Sidecar, which is optimized for operating the solution in clouds. However, a production-ready version was not available when this magazine went to press. © ShardingSphere

Sidecar works in close cooperation with the Proxy, which is essential in a Sidecar setup that uses ShardingSphere. Thanks to an integrated mesh functionality, the individual applications no longer communicate directly with the Sharding Proxy, but with a local instance of a Sharding Mesh Sidecar. The active Mesh Sidecars, in turn, forward the data to ShardingSphere's Sharding Sidecars, which ultimately communicate with the database back ends.
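The request path described above (application, local Mesh Sidecar, Sharding Sidecar, database back end) can be modeled as a simple chain of handlers. The following Python sketch is purely illustrative: all function names are hypothetical, and because Sidecar was not yet released, nothing here reflects an actual ShardingSphere API.

```python
from typing import Callable, List, Tuple

Handler = Callable[[str, List[str]], str]


def hop(name: str, next_handler: Handler) -> Handler:
    """Wrap a handler so that each request records this hop, then moves on."""
    def handle(query: str, trace: List[str]) -> str:
        trace.append(name)
        return next_handler(query, trace)
    return handle


def backend(query: str, trace: List[str]) -> str:
    """Stand-in for a database back end; it just echoes the query."""
    trace.append("database back end")
    return f"rows for: {query}"


# Assemble the chain from the back end outward.
sharding_sidecar = hop("sharding sidecar", backend)
mesh_sidecar = hop("mesh sidecar", sharding_sidecar)


def application_query(query: str) -> Tuple[str, List[str]]:
    """The application only talks to its local mesh sidecar."""
    trace: List[str] = []
    result = mesh_sidecar(query, trace)
    return result, trace
```

Calling application_query("SELECT 1") returns the back end's answer together with the list of hops the request passed through, in the order given in the text.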

At the time of writing, ShardingSphere-Sidecar had not been released for production, so you will have to make do with alternatives for now. One option is to combine the ShardingSphere-Proxy server with a service mesh such as Istio.

Conclusions

ShardingSphere is a comprehensive and powerful tool for upgrading conventional databases for today's scalable IT world. Unlike its competitor YugabyteDB, for example, it does not try to reinvent the wheel: ShardingSphere lets applications talk to a real MySQL or PostgreSQL database instead of a translation layer.

However, the individual ShardingSphere components are noticeably not fully feature-compatible with each other. The JDBC implementation came first, and it shows: It still offers the most features. This does not demote ShardingSphere-Proxy to a second-class component, though. From the developer's or administrator's point of view, it is important to choose carefully between the variants. If JDBC is available within the scope of a project anyway, this option is a good choice.

Either way, if you want to look beyond ready-made boxed solutions for scalable databases, ShardingSphere should be on your list of solutions to evaluate, as should Vitess, which takes a similar approach for MySQL.

Infos

  1. Apache ShardingSphere: https://shardingsphere.apache.org
  2. Li, R., L. Zhang, J. Pan, J. Liu, P. Wang, N. Sun, S. Wang, C. Chen, F. Gu, and S. Guo. Apache ShardingSphere: A Holistic and Pluggable Platform for Data Sharding: VIII. Evaluations. In: Proceedings of 2022 IEEE 38th International Conference on Data Engineering (ICDE) (IEEE, May 2022), pp. 2468-2480: http://www.kangry.net/paper/ICDE2022_SS.pdf
  3. Vitess: https://vitess.io
  4. YugabyteDB: https://www.yugabyte.com

The Author

Freelance journalist Martin Gerhard Loschwitz focuses primarily on topics such as OpenStack, Kubernetes, and Chef.

