Lead Image © Claudia Paulussen, Fotolia.com

Take your pick from a variety of AWS databases

Choose Carefully

Article from ADMIN 40/2017
We look at the variety of databases available in Amazon Web Services – from relational, to NoSQL, to data warehouses for petabyte data volumes.

For quick and easy access to databases in the cloud, you will find the most popular types in the form of a database-as-a-service (DBaaS). In addition to relational databases, NoSQL alternatives enjoy growing popularity. Each database type weighs aspects such as flexibility, read and write speed, resilience, license costs, scalability, and maintainability differently.

Amazon RDS: Managed SQL in the Cloud

Although relational database products differ widely in their details, the properties and management processes they share can be abstracted. The Amazon Relational Database Service (RDS) provides a common interface (API) that encapsulates vendor-specific aspects. I look at the shared features and benefits of six RDS engines – Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and MariaDB – before going into the specifics of Amazon Aurora.

The RDS API simplifies administrative processes by orchestrating the necessary procedures at the database (DB) and infrastructure levels. For example, to create a DB instance, you do not have to deploy a server explicitly or run the product-specific installation. Instead, a new DB instance is available within minutes. You can access the RDS API from the AWS Management Console or call it from automation scripts through a Software Development Kit (SDK), which lets you implement approaches such as infrastructure-as-code or respond automatically to events.
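
As a rough illustration of the SDK route, the following Python sketch assembles the request for a new DB instance and hands it to the RDS API through boto3, the AWS SDK for Python. The identifier, instance class, storage size, credentials, and region are placeholders invented for this example. The actual call sits in a separate function because it needs valid AWS credentials and creates a billable resource.

```python
def build_instance_request(identifier, engine="mysql", multi_az=False):
    """Assemble keyword arguments for the RDS CreateDBInstance call."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,
        "DBInstanceClass": "db.t2.micro",         # placeholder instance type
        "AllocatedStorage": 20,                    # GiB, placeholder size
        "MasterUsername": "dbadmin",               # placeholder credentials
        "MasterUserPassword": "change-me-please",  # never hard-code real secrets
        "MultiAZ": multi_az,
    }


def create_instance(request, region="eu-central-1"):
    """Send the request to RDS; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    rds = boto3.client("rds", region_name=region)
    response = rds.create_db_instance(**request)
    return response["DBInstance"]["DBInstanceStatus"]


if __name__ == "__main__":
    # Only builds and prints the request; nothing is created in AWS.
    print(build_instance_request("example-db", multi_az=True))
```

Keeping the request in a plain dictionary means it can be stored in version control and reviewed like any other infrastructure-as-code artifact before create_instance() is ever called.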

In RDS, the database can consist of a single DB instance, which makes economic sense for testing and development purposes. For production use, you can operate the database with two DB instances in a primary-standby configuration. The two instances are placed in different availability zones (AZs), which are handled like separate data centers within a region and therefore have different risk profiles. This configuration is known as RDS Multi-AZ deployment.

Changes to the primary instance replicate synchronously to the standby, so a failover does not lose data. During a failover, the DNS canonical name (CNAME) record is changed to point to the standby instance, which then acts as the new primary instance. For this reason, clients should not cache the DNS entry for longer than 60 seconds. RDS provides other features that improve reliability for critical production databases, including automated backups and database snapshots. In Multi-AZ mode, snapshots can be taken from the standby instance, so the primary instance is not interrupted.
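
The point about DNS caching can be sketched in a few lines of Python. The helper below resolves the endpoint name again on every attempt instead of holding on to an address from an earlier lookup, so it picks up the changed CNAME record after a failover. The function name, the retry counts, and the connect callable (a stand-in for a real database driver) are assumptions made for this illustration, not part of RDS.

```python
import socket
import time


def connect_with_retry(hostname, connect, resolve=socket.gethostbyname,
                       attempts=5, delay=5.0):
    """Connect to an endpoint, resolving its DNS name anew on every attempt.

    `connect` is any callable that takes an IP address and returns a
    connection or raises OSError; it stands in for a real database driver.
    """
    last_error = None
    for attempt in range(attempts):
        address = resolve(hostname)  # fresh lookup, no cached address
        try:
            return connect(address)
        except OSError as error:
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay)    # wait before looking the name up again
    raise ConnectionError(f"no connection to {hostname}") from last_error
```

Many drivers and runtimes resolve names themselves, in which case the equivalent measure is to make sure their DNS cache lifetime stays below the 60 seconds mentioned above.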

Vertical and Horizontal Scalability

Several kinds of scalability are supported. Currently, RDS scales vertically by changing the instance type, from one virtual CPU (vCPU) and 1GB of RAM up to 40 vCPUs and 244GB of RAM. If the database is operated in Multi-AZ mode, it is unavailable during such a change only for the duration of the automatic failover. The storage for the DB instance can be scaled in terms of both storage type (e.g., General Purpose (SSD) or Provisioned IOPS (SSD), where IOPS stands for input/output operations per second) and storage size. Storage can grow up to 6TB (64TB for Aurora), and resizing it does not affect availability.
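
A vertical scaling step could look roughly like the following Python sketch, which uses the ModifyDBInstance call of the RDS API through boto3. The instance identifier, the target instance class, the storage size, and the region are placeholders chosen for this example.

```python
def build_scaling_request(identifier, instance_class=None, storage_gib=None,
                          apply_immediately=False):
    """Assemble keyword arguments for the RDS ModifyDBInstance call."""
    request = {
        "DBInstanceIdentifier": identifier,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }
    if instance_class is not None:
        request["DBInstanceClass"] = instance_class  # vertical scaling
    if storage_gib is not None:
        request["AllocatedStorage"] = storage_gib    # storage size in GiB
    return request


def scale_instance(request, region="eu-central-1"):
    """Send the change to RDS; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    rds = boto3.client("rds", region_name=region)
    response = rds.modify_db_instance(**request)
    return response["DBInstance"]["PendingModifiedValues"]


if __name__ == "__main__":
    # Only builds and prints the request; nothing is changed in AWS.
    print(build_scaling_request("example-db", instance_class="db.m4.xlarge"))
```

Leaving apply_immediately at False is the cautious default here, because a change of instance class involves the brief interruption described above.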

Many Amazon RDS engines scale horizontally by distributing read traffic across multiple Read Replica instances [1]. You can create up to five Read Replicas with a simple call. RDS takes care of the operations for creating the new DB instance and replicating the data. The Read Replicas are addressed by clients through their individual connection endpoints.
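
Because each Read Replica has its own endpoint, distributing the read traffic is left to the client. The Python sketch below shows one simple way to do this: a request builder for the CreateDBInstanceReadReplica call in boto3 and a small router that sends writes to the primary and rotates reads across the replicas. The ReadRouter class, the replica naming scheme, and the endpoint names are hypothetical and exist only for this illustration.

```python
import itertools


def build_replica_request(source_identifier, index):
    """Assemble keyword arguments for CreateDBInstanceReadReplica."""
    return {
        "DBInstanceIdentifier": f"{source_identifier}-replica-{index}",
        "SourceDBInstanceIdentifier": source_identifier,
    }


def create_replica(request, region="eu-central-1"):
    """Send the request to RDS; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the rest works without the SDK

    rds = boto3.client("rds", region_name=region)
    return rds.create_db_instance_read_replica(**request)


class ReadRouter:
    """Send writes to the primary and rotate reads across the replicas."""

    def __init__(self, primary_endpoint, replica_endpoints):
        self.primary = primary_endpoint
        # Fall back to the primary if no replica exists yet.
        self._replicas = itertools.cycle(replica_endpoints or [primary_endpoint])

    def endpoint_for(self, is_write):
        return self.primary if is_write else next(self._replicas)
```

A round-robin rotation is only the simplest possible policy. It ignores replication lag and replica load, both of which a production setup would probably want to take into account.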

To restrict access at the network level, locating the instances in an Amazon Virtual Private Cloud (VPC) is recommended. Thanks to network access control lists, DB security groups (stateful firewalls), IPsec VPNs, and routing tables, the traffic can be controlled in a very granular manner.

As with most AWS services, you decide where the DB instance is created (i.e., in which AWS region) when calling the API. The data then remains within the selected region, which helps meet regulatory requirements or optimize latency to the database client. Additionally, many Amazon RDS engines offer data encryption in transit and at rest. For encryption at rest, checking a single box when creating a new DB instance is sufficient in the simplest case. Billing is based on hourly usage; if your usage is predictable in the long term, reserved instances can be worth a look.
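
In API terms, the region is chosen when the client is created, and encryption at rest is a single flag on the create request. The Python sketch below illustrates both with boto3. The helper names, the region, and the values in the base request are assumptions made for this example.

```python
def encrypted_request(request, kms_key_id=None):
    """Return a copy of a CreateDBInstance request with encryption at rest."""
    secured = dict(request)                # leave the caller's request untouched
    secured["StorageEncrypted"] = True
    if kms_key_id is not None:
        secured["KmsKeyId"] = kms_key_id   # otherwise the default key is used
    return secured


def create_in_region(request, region):
    """Create the instance in one region; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without the SDK

    rds = boto3.client("rds", region_name=region)  # region fixes where data lives
    response = rds.create_db_instance(**request)
    return response["DBInstance"]["StorageEncrypted"]


if __name__ == "__main__":
    base = {
        "DBInstanceIdentifier": "example-db",      # placeholder values throughout
        "Engine": "mysql",
        "DBInstanceClass": "db.m4.large",
        "AllocatedStorage": 100,
        "MasterUsername": "dbadmin",
        "MasterUserPassword": "change-me-please",
    }
    # Only builds and prints the request; nothing is created in AWS.
    print(encrypted_request(base))
```

Not every instance class supports encryption at rest, so the placeholder class above would need to be checked against the current RDS documentation.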

Individual Features for RDS

The unified approach for setting up and managing the different RDS engines does not prevent you from making individual settings. Defaults give you a quick start, and finer tuning is possible via DB parameter groups, in which up to 300 parameters (depending on the RDS engine and version) can be adjusted and then applied to DB instances (Figure 1). In the same way, individual product features (e.g., transparent data encryption) can be defined in DB option groups and applied to DB instances.

Figure 1: Parameter groups let you define engine-specific parameters tailored to the application, which you can then apply to DB instances.
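
The same workflow can be scripted. The following Python sketch creates a DB parameter group with boto3, changes two settings, and attaches the group to an instance. The group name, the parameter family, the chosen MySQL parameters and their values, and the region are placeholders for this example.

```python
def build_parameter_changes(settings, apply_method="pending-reboot"):
    """Turn a {name: value} mapping into the list ModifyDBParameterGroup expects."""
    return [
        {
            "ParameterName": name,
            "ParameterValue": str(value),  # the API expects string values
            "ApplyMethod": apply_method,   # "immediate" or "pending-reboot"
        }
        for name, value in sorted(settings.items())
    ]


def apply_parameter_group(group_name, family, settings, instance_id,
                          region="eu-central-1"):
    """Create the group, set parameters, attach it; needs boto3 and credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    rds = boto3.client("rds", region_name=region)
    rds.create_db_parameter_group(
        DBParameterGroupName=group_name,
        DBParameterGroupFamily=family,     # e.g., "mysql5.7"
        Description="Example parameter group",
    )
    rds.modify_db_parameter_group(
        DBParameterGroupName=group_name,
        Parameters=build_parameter_changes(settings),
    )
    rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBParameterGroupName=group_name,
    )


if __name__ == "__main__":
    # Only builds and prints the parameter list; nothing is changed in AWS.
    print(build_parameter_changes({"max_connections": 200, "slow_query_log": 1}))
```

The "pending-reboot" default is a deliberately conservative choice: changes then take effect only after the instance is rebooted, whereas "immediate" works for dynamic parameters only.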
