The Distributed SQL Blog

Thoughts on distributed databases, open source, and cloud native

Jepsen Testing on YugabyteDB

At YugaByte, our mission is to build a robust, reliable, distributed OLTP database. Needless to say, we take the correctness and technical accuracy of our claims very seriously. We therefore love a testing framework like Jepsen, which helps verify correctness, and we are fans of Kyle Kingsbury’s work!

Here is a summary of what we have done so far with regard to Jepsen:

  • We have performed our own DIY-style Jepsen testing
  • The YugabyteDB Jepsen testing repository is open source
  • For the current suite of Jepsen tests for YugabyteDB that we have tested in a loop,

Read more

Building a High Growth Business by Monetizing Open Source Software

Whenever a venture-funded software infrastructure startup takes the open source route to market, a few questions emerge:

  • What open source license and project governance model will it choose?
  • How will it monetize the open source project?
  • What if AWS, Microsoft Azure or Google Cloud offer the startup’s open source project as a managed service?

At Yugabyte, we answered these questions for the open source YugabyteDB project in the following way:

  • Distributed under the highly permissive Apache 2.0 license and managed under an open self-governance model.

Read more

YugaByte Database Engineering Update – August 20, 2018

Time for another update from the engineering team at YugaByte! It has been a month since the last update, so let’s dive right in.

Community News

Recent Events

On the conference front, YugaByte was at Google Next towards the end of July. YugabyteDB was already very well integrated into the Google Cloud Compute ecosystem, and we additionally announced support for Google Kubernetes Engine (GKE). You can also read about how YugabyteDB compares with the various Google Cloud databases.

Read more

Understanding How YugabyteDB Runs on Kubernetes

As we reviewed in “Docker, Kubernetes and the Rise of Cloud Native Databases”, Kubernetes has benefited from rapid adoption to become the de facto choice for container orchestration. This has happened in a short span of only four years since Google open sourced the project in 2014. YugabyteDB’s automated sharding and strongly consistent replication architecture lends itself extremely well to containerized deployments powered by Kubernetes orchestration. In this post we’ll look at the various components involved in getting YugabyteDB up and running as Kubernetes StatefulSets.

Read more

Benchmarking an 18 Terabyte YugabyteDB Cluster with High Density Data Nodes

For ever-growing data workloads such as time series metrics and IoT sensor events, running a highly dense database cluster where each node stores terabytes of data makes perfect sense from a cost efficiency standpoint. If we spin up new data nodes only to get more storage per node, we waste expensive compute resources. However, running multi-terabyte data nodes with Apache Cassandra as well as other Cassandra-compatible databases (such as DataStax Enterprise) is not an option.

Read more

Apache Cassandra Architecture Fundamentals

What is Apache Cassandra?

Apache Cassandra is a distributed open source database that can be referred to as a “NoSQL database” or a “wide column store.” Cassandra was originally developed at Facebook to power its “Inbox” feature and was released as an open source project in 2008. Cassandra is designed to handle “big data” workloads by distributing data, reads and writes (eventually) across multiple nodes with no single point of failure.

Read more

How Does the Raft Consensus-Based Replication Protocol Work in YugabyteDB?

Editor’s note: This post was originally published August 8, 2018 and has been updated as of May 28, 2020.

As we saw in “How Does Consensus-Based Replication Work in Distributed Databases?”, Raft has become the consensus replication algorithm of choice when it comes to building resilient, strongly consistent systems. YugabyteDB uses Raft for both leader election and data replication. Instead of having a single Raft group for the entire dataset in the cluster,

Read more

YugaByte Company and Database Update – Aug 3, 2018

$16 Million Funding Round

In case you missed the news earlier this summer, YugaByte raised an additional $16M of funding from Dell Technologies Capital and our previous investor Lightspeed Venture Partners. With the additional funding, we are accelerating investments in engineering, sales, and customer success to scale our support for enterprises building business-critical applications in the cloud. So, as you’d expect…

We are Hiring!

Current open positions in Sunnyvale,

Read more

How Does Consensus-Based Replication Work in Distributed Databases?

Editor’s note: This post was originally published August 2, 2018 and has been updated as of May 26, 2020.

Whether it be a WordPress website’s MySQL backend or Dropbox’s multi-exabyte storage system, data replication is at the heart of making data durable and available in the presence of hardware failures such as machine crashes, disk failures, network partitions, and clock skews. The basic idea behind replication is very simple: keep multiple copies of data on physically isolated hardware so that one hardware failure does not impact the others;

Read more
