DISCOVER MORE
FIND OUT MORE
READ NOW

Blogs by: Sid Choudhury

Apache Cassandra: The Truth Behind Tunable Consistency, Lightweight Transactions & Secondary Indexes

Apache Cassandra: The Truth Behind Tunable Consistency, Lightweight Transactions & Secondary Indexes

ACID transactions were a big deal when first introduced formally in the 1980s in monolithic SQL databases such as Oracle and IBM DB2. Popular distributed NoSQL databases of the past decade including Apache Cassandra initially focused on “big data” use cases that did not require such guarantees and hence avoided implementing them altogether. Our post, “A Primer on ACID Transactions: The Basics Every Cloud App Developer Must Know” details the various types of ACID transactions (single key,

Read more

Google Spanner vs. Calvin: Is There a Clear Winner in the Battle for Global Consistency at Scale?

Google Spanner vs. Calvin: Is There a Clear Winner in the Battle for Global Consistency at Scale?

Prof. Daniel Abadi, lead inventor of the Calvin transaction management protocol and the PACELC theorem, wrote a thought-provoking post last month titled “NewSQL database systems are failing to guarantee consistency, and I blame Spanner”. The post takes a negative view of software-only Google Spanner derivative databases such as YugabyteDB and CockroachDB that use Spanner-like partitioned consensus for single shard transactions and a two phase commit (2PC) protocol for multi-shard (aka distributed) ACID transactions.

Read more

Building a High Growth Business by Monetizing Open Source Software

Building a High Growth Business by Monetizing Open Source Software

Whenever a venture-funded software infrastructure startup takes the open source route to market, a few questions emerge:

  • What open source license and project governance model will it choose?
  • How will it monetize the open source project?
  • What if AWS, Microsoft Azure or Google Cloud offer the startup’s open source project as a managed service?

At Yugabyte, we answered these questions for the open source YugabyteDB project in the following way:

  • Distributed under the highly permissive Apache 2.0 license and managed under an open self-governance model.

Read more

Understanding How YugabyteDB Runs on Kubernetes

Understanding How YugabyteDB Runs on Kubernetes

As we reviewed in “Docker, Kubernetes and the Rise of Cloud Native Databases”, Kubernetes has benefited from rapid adoption to become the de-facto choice for container orchestration. This has happened in a short span of only 4 years since Google open sourced the project in 2014. YugabyteDB’s automated sharding and strongly consistent replication architecture lends itself extremely well to containerized deployments powered by Kubernetes orchestration. In this post we’ll look at the various components involved in getting YugabyteDB up and running as Kubernetes StatefulSets.

Read more

How Does the Raft Consensus-Based Replication Protocol Work in YugabyteDB?

How Does the Raft Consensus-Based Replication Protocol Work in YugabyteDB?

Editor’s note: This post was originally published August 8, 2018 and has been updated as of May 28, 2020.

As we saw in ”How Does Consensus-Based Replication Work in Distributed Databases?”, Raft has become the consensus replication algorithm of choice when it comes to building resilient, strongly consistent systems. YugabyteDB uses Raft for both leader election and data replication. Instead of having a single Raft group for the entire dataset in the cluster,

Read more

How Does Consensus-Based Replication Work in Distributed Databases?

How Does Consensus-Based Replication Work in Distributed Databases?

Editor’s note: This post was originally published August 2, 2018 and has been updated as of May 26, 2020.

Whether it be a WordPress website’s MySQL backend or Dropbox’s multi-exabyte storage system, data replication is at the heart of making data durable and available in the presence of hardware failures such as machine crashes, disk failures, network partitions, and clock skews. The basic idea behind replication is very simple: keep multiple copies of data on physically isolated hardware so that one hardware failure does not impact the others;

Read more

New to Google Cloud Databases? 5 Areas of Confusion That You Better Be Aware of

New to Google Cloud Databases? 5 Areas of Confusion That You Better Be Aware of

After billions of dollars in capital expenditure and reference customers in every major vertical, Google Cloud Platform has finally emerged as a credible competitor to Amazon Web Services and Microsoft Azure when it comes to enterprise-ready cloud infrastructure. While Google Cloud’s compute and storage offerings are easier to understand, making sense of its various managed database offerings is not for the faint-hearted. This post introduces app developers to the major Google Cloud database services,

Read more

Implementing Distributed Transactions the Google Way: Percolator vs. Spanner

Implementing Distributed Transactions the Google Way: Percolator vs. Spanner

Our post 6 Signs You Might be Misunderstanding ACID Transactions in Distributed Databases describes the key challenges involved in building high performance distributed transactions. Multiple open source ACID-compliant distributed databases have started building such transactions by taking inspiration from research papers published by Google. In this post, we dive deeper into Percolator and Spanner, the two Google systems behind those papers, as well as the open source databases they have inspired.

Read more

6 Signs You Might be Misunderstanding ACID Transactions in Distributed Databases

6 Signs You Might be Misunderstanding ACID Transactions in Distributed Databases

As described in A Primer on ACID Transactions, first generation NoSQL databases dropped ACID guarantees with the rationale that such guarantees are needed only by old school enterprises running monolithic, relational applications in a single private datacenter. And the premise was that modern distributed apps should instead focus on linear database scalability along with low latency, mostly-accurate, single-key-only operations on shared-nothing storage (e.g. those provided by the public clouds).

Application developers who blindly accept the above reasoning are not serving their organizations well.

Read more

A Busy Developer’s Guide to Database Storage Engines — The Basics

A Busy Developer’s Guide to Database Storage Engines — The Basics

Editor’s note – This is the first part of a two-part series on database storage engines:

When evaluating operational databases, developers building distributed cloud applications tend to focus on data modeling flexibility, consistency guarantees, linear scalability,

Read more

Explore Distributed SQL and YugabyteDB in Depth

Discover the future of data management.
Learn at Yugabyte University
Learn More
Browse Yugabyte Docs
Read More
Distributed SQL for Dummies
Read for Free