Data up in the clouds.

Data is the oil of today, and the cloud is the best place for the refinery. Just as there are many different kinds of refineries for turning oil into gas, diesel, and kerosene, there are many different ways to store your data in the cloud.

Let’s take a look at some of the ways data can be stored in the AWS cloud.

For block, file, and object storage, AWS has Elastic Block Store (EBS), Elastic File System (EFS), and Simple Storage Service (S3). In this post, let’s look only at the different possibilities for databases on AWS.

Build and Maintain it Yourself

If you are an experienced database administrator (DBA) and like to have full control over the deployment of your database, you can create an Amazon Elastic Compute Cloud (EC2) instance. This gives you Infrastructure as a Service (IaaS): a server that you can use just like any other to install, configure, and maintain your database.

If you want to make your life a little easier, you can select an EC2 image from the AWS Marketplace that already has your desired database installed. You can still make any configuration changes you want, and you still need to maintain it just as you would any other database you installed yourself.

Let AWS Build and Maintain it

If you are not an experienced DBA, or simply do not want the task of installing and configuring the database yourself, then the Amazon Relational Database Service (RDS) could be the solution for you. It is essentially a Platform as a Service (PaaS) that takes over most of the routine responsibilities of a DBA, such as configuration, patching, and backups. With just a few clicks the database is up, running, and ready to be used. There are six database engines to choose from: Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server.

No need to Build or Maintain it: Serverless

The solutions mentioned above are up and running 24 hours a day, 7 days a week, which means you are charged for their continuous availability. Many applications do not need a full-time server. Here are some examples:

  • Infrequently used applications   
  • New applications
  • Variable workloads
  • Unpredictable workloads
  • Development and test databases
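
To make the cost argument concrete, here is a small back-of-the-envelope sketch. It compares a server billed for every hour of the month with a service billed only for the hours it is actually used. The two hourly rates are made-up numbers chosen purely for illustration and are not real AWS prices; the function names are hypothetical as well.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month

# Hypothetical prices, for illustration only (NOT real AWS pricing).
ALWAYS_ON_RATE = 0.10    # dollars per hour, billed whether used or not
PAY_PER_USE_RATE = 0.15  # dollars per hour, billed only while in use


def always_on_cost(hourly_rate: float = ALWAYS_ON_RATE) -> float:
    """Monthly cost of a server that runs around the clock."""
    return hourly_rate * HOURS_PER_MONTH


def pay_per_use_cost(busy_hours: float,
                     hourly_rate: float = PAY_PER_USE_RATE) -> float:
    """Monthly cost when only the busy hours are billed."""
    return hourly_rate * busy_hours


def cheaper_option(busy_hours: float) -> str:
    """Return which billing model is cheaper for a given monthly usage."""
    if pay_per_use_cost(busy_hours) < always_on_cost():
        return "pay-per-use"
    return "always-on"


# A test database used two hours per working day (about 40 hours a month)
# versus a busy production system that is active almost all the time.
for hours in (40, 700):
    print(hours, "busy hours per month ->", cheaper_option(hours))
```

With these assumed rates, the lightly used database is cheaper on the pay-per-use model even though its hourly price is higher, while the busy system is better off on an always-on server.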

 

If your application falls into any of the above categories, then a serverless database could be the best choice. AWS has a variety of very powerful options to choose from. The databases listed below are fully managed, and several of them, such as DynamoDB and Aurora Serverless, are serverless. These can also be called DBaaS (Database as a Service).

DynamoDB is a very high-performance NoSQL key-value database.

Aurora Serverless is for Online Transaction Processing (OLTP) using a traditional relational database management system (RDBMS). You have a choice of Aurora based on MySQL, which AWS says is up to 5 times faster than standard MySQL, or Aurora based on PostgreSQL, which AWS says is up to 3 times faster than standard PostgreSQL.

Redshift is for Online Analytical Processing (OLAP) and data warehousing. It was originally based on an older version of PostgreSQL. AWS claims it is the fastest cloud data warehouse, 3 times faster and 50% cheaper than other cloud data warehouses.
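
The difference between OLTP and OLAP is easiest to see in the shape of the queries. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in; it is not Aurora or Redshift, and the table and column names are invented for the example.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a real service.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
)

# OLTP style: many small transactions, each touching one or a few rows.
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)",
        [("alice", "EU", 20.0), ("bob", "US", 35.0), ("carol", "EU", 15.0)],
    )
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (25.0, 1))

# OLAP style: one query that scans and aggregates many rows at once.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
)
print(totals)
```

An OLTP database is tuned for the first kind of work (fast, safe updates to individual rows), while a data warehouse is tuned for the second (aggregations over very large tables).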

Neptune is a graph database (GDB) service. It is good for recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.
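
To give a feel for why recommendation engines map well onto graphs, here is a minimal sketch of a "people you may want to follow" query written as a two-step graph traversal. It uses a plain Python dictionary, not Neptune or any of its query languages, and the graph data and function name are made up for illustration.

```python
from collections import Counter

# Hypothetical "follows" graph stored as an adjacency mapping.
FOLLOWS = {
    "ann": {"bob", "cat"},
    "bob": {"cat", "dan"},
    "cat": {"dan", "eve"},
    "dan": set(),
    "eve": set(),
}


def recommend(user, graph):
    """Suggest accounts followed by the people `user` already follows.

    Candidates are ranked by how many of the user's contacts follow them,
    with ties broken alphabetically so the result is deterministic.
    """
    direct = graph.get(user, set())
    counts = Counter()
    for neighbour in direct:
        for candidate in graph.get(neighbour, set()):
            # Skip the user and anyone the user already follows.
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    return sorted(counts, key=lambda name: (-counts[name], name))


print(recommend("ann", FOLLOWS))
```

In a relational database the same question needs self-joins on a relationship table; in a graph database, following edges like this is the native operation.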

Amazon Managed Apache Cassandra Service is a managed service for Apache Cassandra, which is designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low-latency operations for all clients. It is a good choice for Big Data applications.

Timestream is specifically designed for IoT and time-series data.

DocumentDB  is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads.

Quantum Ledger Database (QLDB) provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. It tracks each and every application data change and maintains a complete and verifiable history of changes over time. It is very similar to a blockchain, but it is not one, because the ledger is centralized rather than distributed. AWS also has a Managed Blockchain service.
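
The idea of a cryptographically verifiable log can be sketched with a simple hash chain, where every entry stores the hash of the entry before it. This is only an illustration of the general principle under my own assumptions; it is not how QLDB is implemented, and all function names are hypothetical.

```python
import hashlib
import json


def entry_hash(prev_hash, data):
    """Hash an entry's data together with the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger, data):
    """Add a new entry that is chained to the current last entry."""
    prev_hash = ledger[-1]["hash"] if ledger else ""
    ledger.append(
        {"data": data, "prev": prev_hash, "hash": entry_hash(prev_hash, data)}
    )


def verify(ledger):
    """Recompute every hash; any edit to an earlier entry breaks the chain."""
    prev_hash = ""
    for entry in ledger:
        expected = entry_hash(prev_hash, entry["data"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"account": "A", "balance": 100})
append(ledger, {"account": "A", "balance": 80})
valid_before = verify(ledger)

# Quietly rewriting history is detected on the next verification.
ledger[0]["data"]["balance"] = 1000
valid_after = verify(ledger)
print(valid_before, valid_after)
```

Because each hash depends on everything that came before it, changing an old record without being noticed would require recomputing every later entry as well.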

In Memory

If your application has very high data throughput and needs extra-low latency, then an in-memory cache between your application and database could be the solution. AWS has ElastiCache, with the option of using Memcached or Redis. DynamoDB has an in-memory cache built just for it, called DynamoDB Accelerator (DAX).
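
A common way to put a cache between an application and its database is the cache-aside pattern: look in the cache first, and only go to the database on a miss. The sketch below uses a plain Python dictionary in place of Redis or Memcached, and the class name, TTL value, and fake database function are all assumptions made for the example.

```python
import time


class CacheAside:
    """Minimal cache-aside helper with a time-to-live per entry."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db  # called only on a cache miss
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # cache hit: no database round trip
        value = self._load(key)    # miss or expired: read from the database
        self._entries[key] = (now + self._ttl, value)
        return value


db_reads = []  # records every simulated database read


def read_from_db(key):
    """Stand-in for a slow database query."""
    db_reads.append(key)
    return f"row for {key}"


cache = CacheAside(read_from_db, ttl_seconds=30.0)
cache.get("user:1")  # miss: goes to the database
cache.get("user:1")  # hit: served from memory
print(len(db_reads), "database read(s) for two lookups")
```

The time-to-live is the main tuning knob here: a longer value means fewer database reads but a greater chance of serving stale data.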

This has only been a very brief overview of the AWS database options. A whole book could be written on this subject. To dive deeper, you should also look at clustering, read replicas, backups, restores, pricing, and fault tolerance.

Oracle released the first commercial relational database in 1979 and has held the lead as the most used database since then. Amazon made the AWS cloud available in 2006 and is still the largest cloud provider. The DB-Engines Ranking has Oracle in the number one spot, and you have to scroll down a bit before you start to see any of the AWS databases. The big difference is that Oracle is quickly losing its popularity while the AWS databases are quickly gaining. I expect that sometime in the near future Oracle will lose the first spot as the most popular database and never hold it again. Amazon, like most large corporations, was a big Oracle user. In October 2019 it shut down its last Oracle database and reported saving 60% in costs by doing so.

Data is the oil of today, and in my opinion the AWS cloud, with so many different database options, is the best place to refine it!

William Worthington
