When moving your data and applications to AWS, or developing a solution from scratch, you will most likely face the challenge of selecting the most appropriate of the many database options AWS offers.
There is no shortage of database options, each with its own strengths and weaknesses. The challenge is matching the capabilities of the databases on offer to your particular use case in both the development prototyping and production phases.
The AWS database solutions on offer at the time of writing include:
Database | Type | Typical Use Cases
Aurora | Relational | Apps, ERP, CRM, E-Comm
AWS RDS (MySQL, PostgreSQL) | Relational | Apps, ERP, CRM, E-Comm
Redshift | Data Warehouse | Big Data
DynamoDB | Key-Value | High Traffic Web Apps, E-Comm
ElastiCache for Redis | In Memory | Caching, Session Mgt, Geospatial
ElastiCache for Memcached | In Memory | Caching, Session Mgt
DocumentDB | Document | Content Mgt, Catalog, User Profiles
Keyspaces for Apache Cassandra | Wide Column | High Speed Scale Industrial Apps
Neptune | Graph | Fraud Detection, Social Networks
Timestream | Time Series | IoT, DevOps, Telemetry
Quantum Ledger Database (QLDB) | Ledger | Banking Transactions, Supply Chain Tracking
The first question to ask yourself is whether you need this database for development prototyping or for production.
AWS Databases for Prototyping
If you are prototyping a new solution, there is a strong case for opting for a simple RDS instance with a MySQL or PostgreSQL database. The chances are you will already be familiar with SQL, so you can avoid learning a new query language or object relational mapper (ORM) just to use one of the other database options.
AWS Databases for Production
When selecting a database for production, there are a number of considerations that will steer you towards the best fit database choice.
Do you need a relational schema?
If your database tables need to contain relational data that references data in other tables, for example a client account number in a "transactions" table that references the client account in the "clients" table, then you'll need to steer back towards an RDS solution.
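The clients and transactions relationship described above can be sketched as follows. This is a minimal illustration that uses SQLite from the Python standard library as a stand-in for an RDS engine; the table and column names (`clients`, `transactions`, `account_number`, `amount_cents`) are assumptions made up for the example.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE clients (
        account_number TEXT PRIMARY KEY,
        name           TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE transactions (
        id             INTEGER PRIMARY KEY,
        account_number TEXT NOT NULL REFERENCES clients(account_number),
        amount_cents   INTEGER NOT NULL
    )
""")

conn.execute("INSERT INTO clients VALUES ('ACC-001', 'Example Client')")
conn.execute(
    "INSERT INTO transactions (account_number, amount_cents) VALUES ('ACC-001', 2500)"
)


def insert_orphan_transaction(connection):
    """Try to record a transaction for an account that does not exist."""
    try:
        connection.execute(
            "INSERT INTO transactions (account_number, amount_cents) VALUES ('ACC-999', 100)"
        )
        return True
    except sqlite3.IntegrityError:
        # The foreign key rejects rows that point at a missing client.
        return False


orphan_accepted = insert_orphan_transaction(conn)

# Join each transaction back to the client it references.
rows = conn.execute("""
    SELECT c.name, t.amount_cents
    FROM transactions AS t
    JOIN clients AS c ON c.account_number = t.account_number
""").fetchall()
```

If your data needs this kind of enforced cross-table reference and join, that is a good sign a relational engine is the right fit.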
RDS Managed or Unmanaged?
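If you want capabilities such as autoscaling, adding nodes to the database cluster and built-in monitoring, then you should look at a managed solution; otherwise an unmanaged relational database should suffice. Note that both options run on Amazon RDS, which is itself a managed service. "Unmanaged" here simply means a standard RDS instance that you size and scale yourself.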
Unmanaged AWS RDS Database Options
In an unmanaged database scenario, the same solution used for prototyping is appropriate. You can use an AWS RDS instance with your preferred DB engine, such as MySQL, PostgreSQL or MariaDB.
Managed AWS RDS Database Options
All paths down the managed RDS route lead to Amazon Aurora, which provides the ability to autoscale and allows you to create read replicas to scale out your cluster when required.
From a cost perspective, however, the next question to consider relates to traffic. Do you anticipate steady, consistent traffic, or traffic arriving in sporadic bursts?
If your traffic is sporadic, then you might like to consider Aurora Serverless. This gives you the features and advantages of Aurora, but you won't be paying for machine time when the database is not in use. Aurora Serverless will essentially scale up when under load and scale back down as traffic demand drops off.
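The relational decision path described so far can be summarised as a small function. This is only a sketch of this article's reasoning, not an AWS tool; the function name `suggest_relational_option` and its parameters are hypothetical.

```python
def suggest_relational_option(
    prototyping: bool,
    needs_autoscaling: bool,
    sporadic_traffic: bool,
) -> str:
    """Walk the relational branch of the guide and return a suggestion."""
    if prototyping:
        # Familiar SQL engine, nothing new to learn.
        return "RDS instance (MySQL or PostgreSQL)"
    if not needs_autoscaling:
        # The "unmanaged" route: a standard RDS instance you size yourself.
        return "RDS instance (MySQL, PostgreSQL or MariaDB)"
    if sporadic_traffic:
        # Bursty traffic: avoid paying for idle capacity.
        return "Aurora Serverless"
    # Steady traffic with autoscaling and read replicas.
    return "Aurora"


suggestion = suggest_relational_option(
    prototyping=False, needs_autoscaling=True, sporadic_traffic=True
)
```

Real selection decisions will involve more inputs (budget, engine compatibility, region availability), so treat this as a checklist in code form rather than a complete rule set.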
Non-Relational Database Options (NoSQL)
Transactional or Non-Transactional
The remaining non-relational NoSQL databases (excluding text search) fall into two categories: transactional and non-transactional.
Transactional
Transactional essentially refers to updating multiple tables at the same time, where every update must complete within a single transaction for the operation to be considered successful. If any one update fails, none of them are applied.
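The all-or-nothing behaviour can be illustrated with a short sketch. SQLite is used here only because it ships with Python and makes the idea easy to run locally; the same principle applies to transactional NoSQL services. The `accounts` and `transfers` tables and the `transfer` helper are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute(
    "CREATE TABLE transfers ("
    "id INTEGER PRIMARY KEY, src TEXT, dst TEXT, amount INTEGER)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(connection, src, dst, amount):
    """Log a transfer and move the money, or change nothing at all."""
    try:
        # The connection context manager commits on success and rolls
        # back every statement in the block if an exception escapes.
        with connection:
            connection.execute(
                "INSERT INTO transfers (src, dst, amount) VALUES (?, ?, ?)",
                (src, dst, amount),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if the source would go negative.
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = transfer(conn, "alice", "bob", 30)    # within balance
second_ok = transfer(conn, "alice", "bob", 500)  # would overdraw alice

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
transfer_count = conn.execute("SELECT COUNT(*) FROM transfers").fetchone()[0]
```

The second call fails on its last statement, yet the transfer log entry and the credit to the destination account written earlier in the same transaction are also undone.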
Ease of Use vs Ability to Scale
When selecting a transactional NoSQL database, there are two AWS options: DynamoDB, and DocumentDB, which is MongoDB compatible. The decision comes down to balancing ease of use against the ability to scale easily.
NoSQL Transactional - Ease of Use Option
The strength of DocumentDB, a fully managed MongoDB-compatible database, is that it is operationally simple and straightforward, with an intuitive syntax that makes it ideal for getting started.
NoSQL Transactional - Ease of Scalability
For an easily scaled, persistent NoSQL database solution, DynamoDB provides very good performance, transactional capability and excellent monitoring through services like CloudWatch.
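To give a feel for DynamoDB's transactional capability, the sketch below builds a request in the shape expected by the TransactWriteItems operation. It only constructs the request dictionary and does not call AWS. The table names (`Accounts`, `Transfers`), the attribute names (`AccountId`, `Balance`, `TransferId`, `FromAccount`, `ToAccount`, `Amount`) and the helper `build_transfer_request` are all hypothetical.

```python
def build_transfer_request(
    transfer_id: str, src_account: str, dst_account: str, amount: int
) -> dict:
    """Build a TransactWriteItems request that debits, credits and logs a transfer."""
    # The low-level DynamoDB API sends numbers as strings tagged with "N".
    amount_value = {"N": str(amount)}
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": "Accounts",
                    "Key": {"AccountId": {"S": src_account}},
                    "UpdateExpression": "SET #bal = #bal - :amount",
                    # Cancels the whole transaction if funds are insufficient.
                    "ConditionExpression": "#bal >= :amount",
                    "ExpressionAttributeNames": {"#bal": "Balance"},
                    "ExpressionAttributeValues": {":amount": amount_value},
                }
            },
            {
                "Update": {
                    "TableName": "Accounts",
                    "Key": {"AccountId": {"S": dst_account}},
                    "UpdateExpression": "SET #bal = #bal + :amount",
                    "ExpressionAttributeNames": {"#bal": "Balance"},
                    "ExpressionAttributeValues": {":amount": amount_value},
                }
            },
            {
                "Put": {
                    "TableName": "Transfers",
                    "Item": {
                        "TransferId": {"S": transfer_id},
                        "FromAccount": {"S": src_account},
                        "ToAccount": {"S": dst_account},
                        "Amount": amount_value,
                    },
                    # Refuse to overwrite an existing transfer record.
                    "ConditionExpression": "attribute_not_exists(TransferId)",
                }
            },
        ]
    }


request = build_transfer_request("t-1", "ACC-001", "ACC-002", 30)
```

With the AWS SDK for Python installed and credentials configured, a request like this could be sent with `boto3.client("dynamodb").transact_write_items(**request)`. If any condition fails, none of the three writes are applied.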
Non-Transactional NoSQL
For a non-transactional NoSQL option, an ElastiCache solution can be used. You can select ElastiCache for Memcached or ElastiCache for Redis depending on your use case and familiarity.
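A common way to use an in-memory store like ElastiCache is the cache-aside pattern: check the cache first, and only read from the primary database on a miss. The sketch below shows the idea with a small in-memory class standing in for a Redis or Memcached client. `InMemoryCache`, `FakeDatabase` and `get_profile` are hypothetical names invented for the example, and no real ElastiCache client is involved.

```python
import time


class InMemoryCache:
    """Stand-in for a cache client: get and set with a time to live."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry has expired, so treat it as a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


class FakeDatabase:
    """Stand-in for the slower primary database; counts its reads."""

    def __init__(self):
        self.reads = 0

    def load_profile(self, user_id):
        self.reads += 1
        return {"user_id": user_id, "name": f"user-{user_id}"}


def get_profile(cache, database, user_id, ttl_seconds=60):
    """Cache-aside read: serve from the cache, fall back to the database."""
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    profile = database.load_profile(user_id)
    cache.set(key, profile, ttl_seconds)
    return profile


cache = InMemoryCache()
database = FakeDatabase()
first = get_profile(cache, database, 42)   # miss: reads the database
second = get_profile(cache, database, 42)  # hit: served from the cache
```

Only the first lookup reaches the database. With a real cache you would also need to decide how entries are invalidated when the underlying data changes.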
Specialised Databases
There are a number of specialised AWS database solutions built to meet the unique needs of specific industries and applications. These include:
Text Search Services - Elasticsearch / CloudSearch
If you are providing a solution that requires text search capabilities, like highlighting, search autocomplete or geospatial search, then your two options are Elasticsearch or AWS CloudSearch. Each "database" service provides a niche solution, enabling you to add rich search capabilities to your website or application by uploading JSON or XML documents to the service.
Amazon Keyspaces for Apache Cassandra
Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. With Amazon Keyspaces, you can run your Cassandra workloads on AWS using the same Cassandra application code and developer tools that you use today.
Typical use cases involve processing data at high speed for applications that require single-digit-millisecond latency, such as industrial equipment maintenance, trade monitoring, fleet management and route optimization.
Amazon Neptune
Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with milliseconds latency.
Amazon Neptune supports popular graph models Property Graph and W3C's RDF, and their respective query languages Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.
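To make "highly connected datasets" more concrete, here is a rough sketch of the kind of two-hop traversal behind a simple recommendation engine. It uses a plain Python dictionary rather than Neptune, Gremlin or SPARQL, and the `follows` data and the `recommend` function are made up for illustration.

```python
from collections import Counter

# Who follows whom: each key points at the set of accounts it follows.
follows = {
    "ana": {"ben", "cara"},
    "ben": {"cara", "dev"},
    "cara": {"dev", "eli"},
    "dev": set(),
    "eli": set(),
}


def recommend(graph, person):
    """Suggest accounts followed by the people this person follows."""
    already_known = graph.get(person, set()) | {person}
    counts = Counter()
    for friend in graph.get(person, set()):
        for candidate in graph.get(friend, set()):
            if candidate not in already_known:
                counts[candidate] += 1
    # Most shared connections first, then alphabetical for stable output.
    return sorted(counts, key=lambda name: (-counts[name], name))


suggestions = recommend(follows, "ana")
```

A graph database is designed to run this style of query over billions of relationships, where hand-written loops or chained relational joins become impractical.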
Amazon Timestream
Amazon Timestream is a fast, scalable, and serverless time series database service for IoT and operational applications that makes it easy to store and analyze trillions of events per day up to 1,000 times faster and at as little as 1/10th the cost of relational databases.
Amazon Quantum Ledger Database (QLDB)
Amazon QLDB is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. Amazon QLDB can be used to track each and every application data change and maintains a complete and verifiable history of changes over time.
Ledgers are typically used to record a history of economic and financial activity in an organization. Many organizations build applications with ledger-like functionality because they want to maintain an accurate history of their applications' data, for example, tracking the history of credits and debits in banking transactions or tracing movement of an item in a supply chain network.
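The idea of a cryptographically verifiable history can be sketched with a simple hash chain, where each entry's hash covers both its own data and the previous entry's hash. This is a conceptual illustration only and should not be read as QLDB's actual journal or digest mechanism. The function names `entry_hash`, `append_entry` and `verify` are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash used before the first entry


def entry_hash(previous_hash: str, data: dict) -> str:
    """Hash the entry data together with the previous entry's hash."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(previous_hash.encode("utf-8") + payload).hexdigest()


def append_entry(ledger: list, data: dict) -> None:
    """Append an entry that is chained to the one before it."""
    previous_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append(
        {
            "data": data,
            "previous_hash": previous_hash,
            "hash": entry_hash(previous_hash, data),
        }
    )


def verify(ledger: list) -> bool:
    """Recompute every hash and confirm the chain is unbroken."""
    previous_hash = GENESIS_HASH
    for entry in ledger:
        if entry["previous_hash"] != previous_hash:
            return False
        if entry["hash"] != entry_hash(previous_hash, entry["data"]):
            return False
        previous_hash = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"account": "ACC-001", "type": "credit", "amount": 100})
append_entry(ledger, {"account": "ACC-001", "type": "debit", "amount": 40})
valid_before = verify(ledger)

# Quietly rewriting history breaks the chain and is detected.
ledger[0]["data"]["amount"] = 1000
valid_after = verify(ledger)
```

Because every hash depends on everything before it, changing an old record invalidates the chain from that point on, which is what makes the history verifiable.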
Amazon Redshift
Amazon Redshift is a fully managed, petabyte-scale, cloud-based data warehouse product designed for large-scale data set storage and analysis. It is also a common target for large-scale data warehouse migrations.
We hope you found this breakdown of the database products available on AWS useful. If you are building applications on the major cloud platforms like AWS, Azure and GCP and are not yet using Hava to automate your network topology diagramming and documentation, we invite you to take a free 14-day trial.
Whether you could benefit from an automated AWS architecture diagram, Google Cloud / Azure network topology or combinations of all three, Hava.io has the solution.
We're confident you'll save a stack of time generating accurate topology diagrams and visualised infrastructure security. You might even discover some resources or vulnerabilities you were unaware of.