How to Migrate Data from Cassandra or MySQL to YugabyteDB?

If you work with databases, at some point you will need to get data in and out of them in a format that a completely different system can consume. YugabyteDB uses CSV files to make this as easy as possible. CSV is arguably the most universally portable format for data migrations.

TL;DR – YugabyteDB makes use of Cassandra’s COPY FROM command and a forked version of Cassandra’s Bulk Loader to get data into the system.
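
As a rough illustration of the CSV hand-off described above, the sketch below serializes a few rows to CSV with Python's standard `csv` module and builds the matching cqlsh `COPY ... FROM` statement as a string. The table `demo.users`, its columns, and the file name `users.csv` are hypothetical, and nothing here connects to a database; the statement is meant to be run separately in cqlsh.

```python
import csv
import io


def rows_to_csv(columns, rows):
    """Serialize rows to CSV text, with the column names as a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def copy_from_statement(table, columns, path):
    """Build a cqlsh COPY ... FROM statement for a CSV file that has a header."""
    cols = ", ".join(columns)
    return f"COPY {table} ({cols}) FROM '{path}' WITH HEADER = true;"


# Hypothetical table and data, for illustration only.
columns = ["id", "name", "city"]
rows = [(1, "Ada", "London"), (2, "Lin", "Taipei, TW")]

csv_text = rows_to_csv(columns, rows)
statement = copy_from_statement("demo.users", columns, "users.csv")

print(csv_text)
print(statement)
```

Fields that contain a comma are quoted by the `csv` writer, so the file stays parseable on the loading side.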

YugaByte Database Engineering Update – Nov 27, 2018

A lot has happened since our last engineering update about three months ago. Below are some of the highlights.

PostgreSQL API Updates & PostgresConf Silicon Valley Wrap-Up

We have made a lot of progress on YSQL, the PostgreSQL-compatible distributed SQL API for YugabyteDB! You can also read about the YSQL architecture, which covers how distributed SQL is implemented in YugabyteDB.

We were at the first-ever PostgresConf Silicon Valley in October 2018.

Apache Cassandra: The Truth Behind Tunable Consistency, Lightweight Transactions & Secondary Indexes

ACID transactions were a big deal when first introduced formally in the 1980s in monolithic SQL databases such as Oracle and IBM DB2. Popular distributed NoSQL databases of the past decade including Apache Cassandra initially focused on “big data” use cases that did not require such guarantees and hence avoided implementing them altogether. Our post, “A Primer on ACID Transactions: The Basics Every Cloud App Developer Must Know” details the various types of ACID transactions (single key,

Benchmarking an 18 Terabyte YugabyteDB Cluster with High Density Data Nodes

For ever-growing data workloads such as time series metrics and IoT sensor events, running a highly dense database cluster where each node stores terabytes of data makes perfect sense from a cost-efficiency standpoint. If we spin up new data nodes only to get more storage per node, we waste a significant amount of expensive compute resources. However, running multi-terabyte data nodes with Apache Cassandra as well as other Cassandra-compatible databases (such as DataStax Enterprise) is not an option.

Apache Cassandra Architecture Fundamentals

What is Apache Cassandra?

Apache Cassandra is a distributed open source database that can be referred to as a “NoSQL database” or a “wide column store.” Cassandra was originally developed at Facebook to power its “Inbox” feature and was released as an open source project in 2008. Cassandra is designed to handle “big data” workloads by distributing data, reads, and writes (eventually) across multiple nodes with no single point of failure.

How Does Consensus-Based Replication Work in Distributed Databases?

Editor’s note: This post was originally published August 2, 2018 and has been updated as of May 26, 2020.

Whether it be a WordPress website’s MySQL backend or Dropbox’s multi-exabyte storage system, data replication is at the heart of making data durable and available in the presence of hardware failures such as machine crashes, disk failures, network partitions, and clock skews. The basic idea behind replication is very simple: keep multiple copies of data on physically isolated hardware so that one hardware failure does not impact the others;

A Quick Guide to Secondary Indexes in YugabyteDB

When creating a Cassandra-compatible YCQL table in YugabyteDB, you are required to create a primary key consisting of one or more columns of the table. Primary-key-based retrievals are efficient because YugabyteDB automatically indexes and organizes the data by the primary key. However, there are many use cases where you may need to retrieve data using columns that are not part of the primary key. This is where secondary indexes help.
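
The idea can be sketched in a few lines of Python. This is a conceptual toy, not how YugabyteDB implements indexes: rows live in a dictionary keyed by primary key, and a second dictionary maps the values of one non-key column back to the primary keys that hold them. The `Table` class, its methods, and the sample data are all hypothetical names made up for this illustration.

```python
from collections import defaultdict


class Table:
    """Toy table with a primary-key lookup and one secondary index."""

    def __init__(self, indexed_column):
        self.rows = {}                   # primary key -> row dict
        self.indexed_column = indexed_column
        self.index = defaultdict(set)    # indexed value -> set of primary keys

    def insert(self, pk, row):
        # On overwrite, drop the stale index entry before adding the new one.
        old = self.rows.get(pk)
        if old is not None:
            self.index[old[self.indexed_column]].discard(pk)
        self.rows[pk] = row
        self.index[row[self.indexed_column]].add(pk)

    def get(self, pk):
        """Primary-key lookup: a single dictionary access."""
        return self.rows.get(pk)

    def find_by_indexed(self, value):
        """Secondary-index lookup: no scan over all rows is needed."""
        return [self.rows[pk] for pk in sorted(self.index.get(value, ()))]


users = Table(indexed_column="city")
users.insert(1, {"name": "Ada", "city": "London"})
users.insert(2, {"name": "Lin", "city": "Taipei"})
users.insert(3, {"name": "Sam", "city": "London"})
users.insert(3, {"name": "Sam", "city": "Taipei"})  # update moves the index entry

london = [row["name"] for row in users.find_by_indexed("London")]
taipei = [row["name"] for row in users.find_by_indexed("Taipei")]
```

Without the second dictionary, finding users by city would mean checking every row. The index trades extra work on each write for a direct lookup on reads.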

DynamoDB vs MongoDB vs Cassandra for Fast Growing Geo-Distributed Apps

Amazon DynamoDB is a popular NoSQL database choice for mid-to-large enterprises. In this post, we look beyond Amazon’s marketing claims to explore how well DynamoDB satisfies the core technical requirements of fast growing geo-distributed apps with low latency reads, a common use case found in today’s enterprises. We examine the development, operational and financial consequences of working around the limitations of DynamoDB when attempting to “force-fit” for this use case. Finally, we compare and contrast alternatives such as MongoDB,

YugabyteDB 1.0 — A Peek Under The Hood

Modern user-facing apps, like E-Commerce and SaaS, frequently require features from multiple databases (broadly — SQL, NoSQL and a cache) to support their multi-workload needs. App developers are responsible for understanding and managing which pieces of data should be stored in which SQL and NoSQL database. Furthermore, the app is also responsible for moving data across the tiers (e.g. populating the cache on reads and invalidating it on writes). This greatly increases development and operational complexity,

Announcing YugabyteDB 1.0! 🍾 🎉

Team YugaByte is delighted to announce the general availability of YugabyteDB 1.0!

It has been an incredibly satisfying experience to, in just two years, build and launch a cloud-scale, transactional and high-performance database that’s already powering real-world production workloads. I wanted to take a moment to share our journey to 1.0 and the road ahead.

The Inspiration

Modern user-facing applications are increasingly moving to a multi-region,
