The Distributed SQL Blog

Thoughts on distributed databases, open source, and cloud native

YugaByte Database Engineering Update – Nov 27, 2018

A lot has happened since our last engineering update about three months ago. Below are some of the highlights.

PostgreSQL API Updates & PostgresConf Silicon Valley Wrap-Up

We have made a lot of progress on YSQL, the PostgreSQL-compatible distributed SQL API for YugabyteDB! You can also read about the YSQL architecture, which covers how distributed SQL is implemented in YugabyteDB.

We were at the first-ever PostgresConf Silicon Valley in October 2018.

Data Modeling Basics – PostgreSQL vs. Cassandra vs. MongoDB

Application developers usually spend considerable time evaluating multiple operational databases to find the one that best fits their workload needs. These needs include simplified data modeling, transactional guarantees, read/write performance, horizontal scaling, and fault tolerance. Traditionally, this selection starts with the SQL vs. NoSQL database categories because each category presents a clear set of trade-offs. High performance, in terms of low latency and high throughput, is usually treated as a mandatory requirement and hence is expected of any database chosen.

Distributed Backups in Multi-Region YugabyteDB Clusters

Our post Getting Started with Distributed Backups in YugabyteDB details the core architecture powering distributed backups in YugabyteDB. It also highlights a few backup/restore operations in a single-region, multi-AZ cluster. In this post, we perform distributed backups in a multi-region YugabyteDB cluster and verify that we achieve performance characteristics similar to those observed in a single-region cluster.

We configured a 9-node cluster with 3 availability zones across 2 regions and repeated the benchmark introduced in that post.

Getting Started with Distributed Backups in YugabyteDB

YugabyteDB is a distributed database with a Google Spanner-inspired strongly consistent replication architecture that is purpose-built for high availability and high performance. This architecture allows administrators to place replicas in independent fault domains, which can be either availability zones or racks in a single region or different regions altogether. These types of multi-AZ or multi-region deployments have the immediate advantage of guaranteeing organizations a higher order of resilience in the event of a zone or region failure.
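
The resilience argument can be illustrated with a small sketch. It assumes a majority-quorum replication scheme, in which a shard stays available as long as more than half of its replicas are reachable. The zone names and helper functions below are hypothetical and are not part of any YugabyteDB API.

```python
def survives_domain_failure(replica_domains, failed_domain):
    """Return True if a majority of replicas remains after one fault domain fails.

    Assumption: availability requires a strict majority of replicas.
    """
    total = len(replica_domains)
    remaining = sum(1 for domain in replica_domains if domain != failed_domain)
    return remaining > total // 2


def tolerates_any_single_failure(replica_domains):
    """Check every fault domain that hosts at least one replica."""
    return all(
        survives_domain_failure(replica_domains, domain)
        for domain in set(replica_domains)
    )


# Hypothetical placements for a shard with three replicas.
spread = ["zone-a", "zone-b", "zone-c"]   # one replica per zone
stacked = ["zone-a", "zone-a", "zone-b"]  # two replicas share a zone

print(tolerates_any_single_failure(spread))
print(tolerates_any_single_failure(stacked))
```

With one replica per zone, losing any single zone still leaves two of three replicas. When two replicas share a zone, losing that zone leaves only one, which is not a majority.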

Presto on YugabyteDB: Interactive OLAP SQL Queries Made Easy

Presto is a distributed SQL query engine optimized for OLAP queries at interactive speed. It was created by Facebook and open-sourced in 2013. Since then, it has gained widespread adoption and become a tool of choice for interactive analytics. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. Its connector architecture lets it query data from many sources, such as SQL and NoSQL databases as well as traditional big data platforms such as Hive/Hadoop.

Are MongoDB’s ACID Transactions Ready for High Performance Applications?

Web app developers initially adopted MongoDB for its ability to model data as “schemaless” JSON documents. This was a welcome relief to many who had previously been bitten by the rigid structure and schema constraints of relational databases. However, two critical concerns that have been a thorn in MongoDB’s side over the years are data durability and ACID transactions. MongoDB has been taking incremental steps to address these issues, leading to the recent 4.0 release with multi-document transaction support.

YSQL Architecture: Implementing Distributed SQL in YugabyteDB

In this post, we will look at the architecture of YSQL, the PostgreSQL-compatible distributed SQL API in YugabyteDB. We will also touch on the current state of the project and the next steps in progress. Here is a quick overview:

  • YugabyteDB has a common distributed storage engine that powers both SQL and NoSQL
  • For supporting NoSQL apps, YugabyteDB is designed for low latency, sub-millisecond reads and massive write scalability. It can handle millions of requests and many TBs of data per node with linear scalability and high resilience.

Introducing YSQL: A PostgreSQL Compatible Distributed SQL API for YugabyteDB

YugaByte’s mission from day one has been to simplify operational database infrastructure. We are doing so by bringing together the best aspects of SQL and NoSQL into a single transactional, high-performance database. I am pleased to announce a key milestone in our mission with the formal introduction of YSQL, YugabyteDB’s PostgreSQL-compatible distributed SQL API, as part of the recent 1.1 release. YSQL (currently in beta) becomes the third member of our multi-API family that previously included two NoSQL APIs,

Apache Cassandra: The Truth Behind Tunable Consistency, Lightweight Transactions & Secondary Indexes

ACID transactions were a big deal when first introduced formally in the 1980s in monolithic SQL databases such as Oracle and IBM DB2. Popular distributed NoSQL databases of the past decade including Apache Cassandra initially focused on “big data” use cases that did not require such guarantees and hence avoided implementing them altogether. Our post, “A Primer on ACID Transactions: The Basics Every Cloud App Developer Must Know” details the various types of ACID transactions (single key,

Google Spanner vs. Calvin: Is There a Clear Winner in the Battle for Global Consistency at Scale?

Prof. Daniel Abadi, lead inventor of the Calvin transaction management protocol and the PACELC theorem, wrote a thought-provoking post last month titled “NewSQL database systems are failing to guarantee consistency, and I blame Spanner”. The post takes a negative view of software-only Google Spanner derivative databases, such as YugabyteDB and CockroachDB, that use Spanner-like partitioned consensus for single-shard transactions and a two-phase commit (2PC) protocol for multi-shard (aka distributed) ACID transactions.
