DynamoDB vs MongoDB vs Cassandra for Fast Growing Geo-Distributed Apps
Amazon DynamoDB is a popular NoSQL database choice for mid-to-large enterprises. In this post, we look beyond Amazon’s marketing claims to explore how well DynamoDB satisfies the core technical requirements of fast-growing, geo-distributed apps with low latency reads, a common use case in today’s enterprises. We examine the development, operational and financial consequences of working around DynamoDB's limitations when attempting to “force-fit” it to this use case. Finally, we compare and contrast alternatives such as MongoDB, Apache Cassandra and YugabyteDB, a high-performance, cloud native, distributed SQL database.
Our post 11 Things You Wish You Knew Before Starting with DynamoDB analyzes DynamoDB strengths and weaknesses in depth. For this post, we use the DynamoDB home page to review some of the basics.
Amazon DynamoDB is a fast and flexible non-relational database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed cloud database and supports both document and key-value store models.
What use cases are suitable for DynamoDB? Once again, from the home page:
Its flexible data model, reliable performance, and automatic scaling of throughput capacity make it a great fit for mobile, web, gaming, ad tech, IoT, and many other applications.
Examples of applications highlighted above are:
- E-commerce apps with product catalog, user personalization and online checkout such as Macys.com, Nike.com and Wayfair.com
- Online games such as Minecraft and League of Legends
- Ad tech software such as AppNexus and AdRoll
- Application & infrastructure monitoring services such as AppDynamics, New Relic and Datadog
- IoT apps such as Nest, Ring and SimpliSafe
These are arguably the most popular use cases for which developers attempt to make DynamoDB work.
Common App Characteristics
Many of the above examples are geo-distributed applications that are fast-growing and require low latency.
Users access these apps from multiple geographic regions. A globally distributed app is the specific case where those regions are spread across multiple continents.
Users are either directly or indirectly generating new data at a fast rate. Unbounded data growth refers to the specific case where it is not possible to limit the data generated.
Users expect the web and mobile UIs of these apps to load extremely fast. Real-time refers to the specific case where data is served as soon as it is generated.
Apps with the above characteristics impose multiple mandatory requirements on the database layer. These requirements can be grouped by the stakeholders within an organization who care about them: app development, cloud operations and business owners.
- Document data model — these applications need to model objects that have a dynamic set of attributes.
- Low latency reads — since the users of these apps are geo-distributed, the database should support sub-ms latency and timeline-consistent reads from each of the geographic regions.
- ACID transactions and secondary indexes — the database should provide the ability to perform updates for a single key as well as multiple keys with ACID guarantees. It should also consistently perform low-latency lookups by various attributes.
- Global write consistency with HA — these apps generate writes from different geographic regions, and those writes should be persisted with strong consistency and high availability.
- Linear scalability — these applications need to scale out as well as scale in read/write throughput on demand.
- Ability to handle high data density — ever-growing datasets such as time series metrics, events and audit logs.
- Operational ease — backups, scaling & rebalancing of clusters, ability to change deployment configuration such as machine types and/or regions.
- Troubleshooting in production — ability to investigate production issues such as an outage or high latency. This could be due to a variety of reasons such as zone failures, network congestion or a software bug.
- Development agility — required to quickly build differentiating features to retain or attract customers. This is a major reason for adopting a microservices design.
- Operational agility — once features are built, they should be offered in the product to end customers quickly. The test-stage-release cycles should be quick. This is a big reason to use a managed service and build a robust Continuous Integration/Continuous Deployment (CI/CD) pipeline.
- Cost effectiveness — the entire, end-to-end solution (including the IaaS cost as well as the operational and maintenance overheads) should be cost effective.
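To make the first few requirements concrete, here is a minimal in-memory sketch of a document model with dynamic attributes and a secondary index for lookups by attribute. It is illustrative only: `DocumentStore`, `put`, `get` and `find_by` are hypothetical names invented for this example, not the API of DynamoDB, MongoDB, Cassandra or YugabyteDB, and the sketch ignores distribution, durability and transactions entirely.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Set


class DocumentStore:
    """Toy document store: primary-key lookups plus secondary indexes.

    Hypothetical class for illustration; not any real database's API.
    """

    def __init__(self, indexed_attributes: List[str]) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}
        # attribute name -> attribute value -> set of primary keys
        self._indexes: Dict[str, Dict[Any, Set[str]]] = {
            attr: defaultdict(set) for attr in indexed_attributes
        }

    def put(self, key: str, doc: Dict[str, Any]) -> None:
        """Insert or overwrite a document and keep the indexes in step."""
        old = self._docs.get(key)
        if old is not None:
            # Remove stale index entries left by the previous version.
            for attr, index in self._indexes.items():
                if attr in old:
                    index[old[attr]].discard(key)
        self._docs[key] = dict(doc)
        for attr, index in self._indexes.items():
            if attr in doc:
                index[doc[attr]].add(key)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        """Primary-key lookup; returns None if the key is absent."""
        return self._docs.get(key)

    def find_by(self, attr: str, value: Any) -> List[Dict[str, Any]]:
        """Lookup by an indexed attribute, ordered by primary key."""
        if attr not in self._indexes:
            raise KeyError(f"no secondary index on {attr!r}")
        keys = self._indexes[attr].get(value, set())
        return [self._docs[k] for k in sorted(keys)]


# Documents in the same store carry different attribute sets.
store = DocumentStore(indexed_attributes=["category"])
store.put("p1", {"name": "Running shoe", "category": "footwear", "size": 10})
store.put("p2", {"name": "Rain jacket", "category": "outerwear", "waterproof": True})
store.put("p3", {"name": "Trail boot", "category": "footwear"})

names = [d["name"] for d in store.find_by("category", "footwear")]
print(names)
```

In a real distributed database the index entries and the documents may live on different nodes, which is why the requirements above pair secondary indexes with multi-key ACID transactions: updating a document and its index entries has to happen atomically.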
Popular Alternatives to DynamoDB
Open source NoSQL databases that are often considered viable DynamoDB alternatives are MongoDB and Apache Cassandra. YugabyteDB is yet another emerging alternative.
MongoDB supports a document data model, is good for a microservices-oriented design and works well in a CI/CD pipeline. Since there is an open-source version, it is possible to run test and dev instances in containers at a fraction of the cost of production instances. But it falls short because it does not support auto-sharding or multi-shard transactions. Additionally, it does not handle large datasets well and is not a very low latency database.
Apache Cassandra can handle high data densities well, is good for a microservices-oriented design and works well in a CI/CD pipeline. Again, since there is an open-source version, it is possible to run test and dev instances at a lower cost than production. But it too falls short because it cannot model documents, is not strongly consistent and is not operationally easy.
YugabyteDB is an open source high-performance SQL database with massive scalability, low latency and geo-distribution. As a Consistent and Partition-tolerant (CP) database with native JSONB document data type, high performance secondary indexes, cloud native operational ease and ability to handle high data density, it serves as an excellent alternative to DynamoDB, MongoDB and Apache Cassandra.
Scoring DynamoDB and the Alternatives
Let us examine how DynamoDB and its popular alternatives stack up against each other for the 11 requirements we identified. As we can see, YugabyteDB is the only solution that stands out as the ideal database to power fast growing geo-distributed apps with low latency.
For fast-growing, geo-distributed applications such as mobile, web, gaming, ad tech, and IoT, YugabyteDB was built from the ground up to satisfy the primary development and operational requirements. As the table below shows, YugabyteDB is able to do so at 10x savings compared to DynamoDB.
And with a built-in distributed cache and native distributed transactions, a separate in-memory cache and a separate RDBMS are no longer needed. This leads to 3x the development agility of a real-world DynamoDB deployment.
- If you are not yet convinced about the challenges of DynamoDB, read our post 11 Things You Wish You Knew Before Starting with DynamoDB.
- Learn more about how YugabyteDB achieves 3x the agility of DynamoDB at only 1/10th the cost.
- Download YugabyteDB and get started on your local machine.