Photo by Eugenio Mazzone on Unsplash

Photo by Eugenio Mazzone on Unsplash

RSQL, a Microsoft SQL clone

Data Stacker

Article from ADMIN 45/2018
By , By , By
The open source relational database RSQL aims to completely replace slimmed-down instances of Microsoft SQL Server.

Database systems (DBSs) have existed for more than 60 years. Such software stores electronic data in a structured and permanent manner, usually in the form of tables with data records of a fixed size. The second task of a DBS, handled by the database management system (DBMS), is to retrieve this data efficiently and bring it to light as required. The DBMS organizes the storage of data within the DBS and checks all read and write accesses to the individual databases, taking into account the authorizations assigned to the various database users.

The data is accessed with the use of a query language, such as a structured query language (SQL), extended structured query language (XSQL), or XML query language (XQuery). The central DBMS discussed in this article and known as RSQL [1] implements a subset of Transact-SQL (T-SQL) [2], which is a proprietary extension of the SQL standard by Microsoft and Sybase. This SQL dialect is characterized by procedural programming, local variables, error handling, string processing, and mathematical operations [3].

Communication between the DBS and the outside world can be established using either a database console in a terminal, intermediary software in the form of an Open Database Connectivity (ODBC) or Java Database Connectivity (JDBC) driver, or a suitable library. On Linux systems, the libdbi [4] framework is usually used as the corresponding library. RSQL provides a C# driver, which you can use as a C# developer to communicate directly with the DBS [5]. The driver is functionally identical to the C# SQL Server driver from Microsoft, so it is sufficient to replace

using System.Data.SqlClient;

with

using Rsqldrv.SqlClient;

in your C# program.

RSQL

RSQL is the brainchild of Geneva-based developer Nicolas Riesch (see the "Nicolas Riesch Interviewed" box). As an engineer, he was in charge of database systems for an insurance company for more than a decade, especially in combination with Microsoft (MS) SQL Server. He observed that most of the required reports were comparatively simple and used only a fraction of the possibilities that the DBS offered. Only a few projects required events, triggers, views, and stored procedures. A passive database was sufficient, especially for backups.

Nicolas Riesch Interviewed

Why did you develop your own DBS?

Nicolas Riesch: I have been working with MS SQL Server for more than 15 years and like the product, but the license costs are exorbitant; for example, $14,400 for installation on an eight-core machine whose hardware only costs $3,000. The more powerful the machine, the more expensive the license, because its costs are proportional to the number of cores. In practice, you always have to achieve an unsatisfactory balance between the license costs and the performance of the hardware. My dream was thus to write an MS SQL clone that I could install on any machine without any limitations. After a great deal of research on the internal mechanisms of a database server, I was able to write my own implementation.

The Go programming language: Looking back, was that a good decision?

NR: Originally, I developed RSQL in C, but when I heard about Go, I decided to change. Go was designed by famous developers like Rob Pike and Ken Thompson; practically, it is the successor of C and C++. It offers a simple but powerful syntax, is compiled, and has a garbage collector. It facilitates concurrent programming and threads. It was originally designed to write server software more efficiently but is suitable for a variety of programming tasks. It is fun to program in Go.

What design principles does RSQL follow? Are there models?

NR: To my knowledge, there is hardly any documentation about the internal design of database servers. You have to figure it out for yourself. On the other hand, this opens up the possibility of finding optimal solutions without following predetermined ideas. The first step is to write a compiler that parses SQL text into bytecode, which then runs in a virtual machine (VM). The VM is more or less a gigantic switch statement within a loop; everything is built around the compiler as the central part of the software.

What is RSQL designed to achieve, and what not? What kind of deployment framework do you have in mind?

NR: RSQL is a replacement product, mainly intended for developers on the Microsoft platform who already know MS SQL Server and develop in C#. It offers the possibility of massive savings in licensing costs. There is no learning curve because RSQL implements Transact-SQL, the SQL dialect of the MS SQL Server. As a simplified clone of the MS SQL Server, RSQL will never be as complex as the original product. It does not currently offer stored procedures, views, or other extended properties, for example. However, in practice, most applications are quite simple. If you only use the SELECT, INSERT, UPDATE, and DELETE statements, you will be happy with RSQL. In case of doubt, you can switch back to MS SQL Server at any time – an SQL script that runs on RSQL will also run on MS SQL Server. A C# driver also implements the same methods as the original driver.

RSQL has a development time of five years and 140,000 lines of program code. Did you think this would be easier?

NR: When I started, I was quite prepared for three or four years of work. I then spent a great deal of time simplifying the code and rewriting many parts several times. Elegant program code is easier to maintain, more efficient, and less error-prone, but also more difficult and time-consuming to write. Together with operating systems and compilers, database servers are among the most demanding forms of software. Cloning also requires extra work because you have to imitate the look and feel of the original product. The "clone" part of RSQL alone contains about 50 percent of the code.

Which part of the program has caused you the most headaches so far?

NR: The largest and most complex components of RSQL are the RSQL compiler, the cache (to process memory buffers and transactions), and the B-tree management code (for reading and writing records to and from files). I always think about the overall design, which should be as simple and elegant as possible. I believe that programming is a creative art with a deep sense of aesthetics. In my opinion, code runs quickly and smoothly if it is elegant. If you want to write a server, I can only advise you to implement it in Go, but you should also be proficient in C, the mother tongue of all programming languages.

What help and support does the project need? Which construction sites are currently in focus?

NR: I need feedback from users who install and try out RSQL. I would like to hear about their first impressions and potential bugs. At the moment, the focus is on consolidating the state of RSQL and polishing all the rough edges. By the end of 2017, I want to implement the IMAGE and TEXT data types so you can save PDFs and images. Features like an optimizer that re-sorts tables in a FROM statement are also missing, but that's a large amount of work, and it won't be until 2018.

In contrast, the licensing costs for the DBS were enormous and not reasonably related to the costs of hosting at the data center. Running multiple instances – usually the proven trio for development, testing, and production – becomes very expensive, especially for smaller companies with existing projects. Switching to MariaDB, MySQL, or PostgreSQL avoids the cost factor of licenses, but at the price of increasing the testing overhead.

Riesch implemented the desire to run existing SQL queries without changes in the form of RSQL. After five years of development, 140,000 lines of code in Go are now available, and they cover Riesch's requirements. You can log in to RSQL with a username and password combination. Currently, the limit is 64 parallel connections per RSQL instance (multiuser; see Table 1 for other criteria).

Table 1

RSQL Key Technical Specs

Criterion RSQL Parameters
Databases per RSQL instance No limit on size and number
Tables per database No limit on number or size of tables
Rows and columns per table No limit on the number of data records
Data record size  8KB max
Data types (current) varbinary, char, varchar, bit, tinyint, smallint, int, bigint, money, numeric, float, date, time, datetime
Data types (future) image, text
Import/export format CSV with fixed separator
Database functions MS SQL Server-like functions and operators
Backup Full backup; format not identical to MS SQL Server
Replication Currently not supported
Cluster mode Currently not supported
Transaction security Yes (rollback on DBS restart)
Multiuser mode Yes

Using RSQL

The basic requirements for using RSQL are a 64-bit system with only 100MB of RAM; 100MB of disk space is sufficient. Basic knowledge of SQL and its syntax are useful for operations. Installation instructions are available in variants for a quick start and for a complete client-server system.

RSQL is not yet available in packaged form. The archive of the current 0.7.1 version contains the server and client in the form of compiled Go files, with no other software dependencies. When unpacking the archive (Listing 1, line 1), the rsql/bin/ folder is created with the two binaries. To save typing, rename the server and client (lines 2 and 3).

Listing 1

Setting Up RSQL

01 $ tar -xzf rsql-0_7_1-linux_amd64.tgz
02 $ mv rsql/bin/rsql_server-0_7_1-linux_amd64 rsql/bin/rsql_server
03 $ mv rsql/bin/rcli-0_7_1-linux_amd64 rsql/bin/rcli
04 $ ls rsql/bin/
05 rsql_server rcli
06
07 $ mkdir rsql-test
08 $ ./rsql/bin/rsql_server -install -dir=/home/<user>/rsql-test -server_default_collation=en_ci_ai -server_default_language=en_us
09 $ ./rsql/bin/rsql_server -dir=/home/<user>/rsql-test

In the next step, you create a project directory, simply named rsql-test/ here (line 7); initialize the DBS (line 8) by replacing /home/<user>/ with the name of your home directory; and adjust the desired localization. Figure 1 shows the structures created during this step in the terminal window on the right. Now start the RSQL server at the command line (line 9).

Figure 1: RSQL in use. The server is already running in the background; the test directory (right) contains the required structures.

The server now runs in the background. You can stop it at any time by pressing Ctrl+C. If you have not changed anything, RSQL listens on port 7777. When you launch the RSQL client, it automatically connects to the server process via this port. In another terminal, send a request from the client to the RSQL server (Listing 2, line 1).

Listing 2

Client Operations

01 $ ./rsql/bin/rcli -U=sa -P=<password> -Q="print 'hello world';"
02 $ ./rsql/bin/rcli -config_model > ~/rcli.conf
03 $ ./rcli -d <Database> batch.sql

In the call, the connection is made as the (already existing) sa user (-U switch) with the corresponding password (-P switch) and the command to be executed ("query" -Q switch), which in the example outputs the character string "hello world" (Figure 2). However, with this form of call, the authentication data ends up in the shell's history, which is undesirable. Therefore, transfer it to a configuration file with the call in Listing 2, line 2.

Figure 2: A "hello world" database query in RSQL.

As in Listing 2, you can send SQL commands to create tables in the database, fill them with content, and query the content. Figure 3 uses a small phone book to show how RSQL handles this. The -d switch specifies the database used.

Figure 3: Joining two tables.

Documentation

On the project website you will find a tutorial to help you learn how to use the software and an overview of the database commands, operators, and functions. RSQL does not yet have a man page, but it at least has integrated help for the individual switches. Only switches of the -s short form are documented, but the more common --switch variant is also allowed.

The project website also provides ready-made database content that impressively demonstrates the performance of RSQL and is worth trying out.

So far, RSQL does not have a database console like MySQL and PostgreSQL, nor does it have a graphical front end. All output is sent to the terminal. If you do not want to enter each query individually but want to transfer all queries in a batch job, then you can save them in a text file and point to it when calling RSQL (Listing 2, line 3).

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy ADMIN Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus