
Distributed SQL Tips and Tricks – March 21, 2022

Marko Rajcevic

Welcome back to our distributed SQL tips and tricks blog! I have the pleasure of recapping distributed SQL questions from around the Internet.

This blog series would not be possible without Dorian Hoxha, Franck Pachot, and Frits Hoogland. We also thank our incredible user community for not being afraid to ask questions.

Do you have questions? Make sure to ask them on our YugabyteDB Slack channel, Forum, GitHub, or Stack Overflow. For previous Tips & Tricks posts, check out our archives. Now let’s dive in.

How is data corruption handled in YugabyteDB compared to PostgreSQL?

Fundamentally, both PostgreSQL and YugabyteDB have their pros and cons because their storage engines are extremely different. With YugabyteDB you get the consistency of PostgreSQL, but with much stronger resiliency. YugabyteDB is resilient to a wide range of failures, including server panics, node failures, and disk corruption. Because the database runs across multiple servers, it tolerates a single disk failure or corruption as a unit. If needed, you can take down a bad server and bring up a new one, all while serving active traffic. A large automotive manufacturer was recently able to keep serving 2 million operations per second despite a server panic.

Regarding disk corruption specifically, the first major difference from PostgreSQL is that YugabyteDB uses an LSM tree with SST files, which are written sequentially and are append-only. Random updates happen in memory only (the first level of the LSM tree is a MemTable), which flushes to an SST file, and SST files compact into new SST files. This reduces the risk of corruption and guarantees that new writes cannot corrupt previously written data. In PostgreSQL, heap tables and B-tree indexes are updated in place with random writes that can touch any block. If corruption occurs at the storage level, it can damage existing data, because new and old data share the same block. This is why the whole database needs to be checked frequently. If a block was corrupted longer ago than the backup retention period, there is no way to get it back.

Dealing with managed services

If you are using a managed service for your Postgres, such as RDS, there is a chance that the same corrupt block exists on the standby, because RDS replicates with storage sync rather than WAL. With YugabyteDB the SST files, once written, are not altered. As a result, new changes cannot corrupt past data. Additionally, YugabyteDB checks data validity as part of compactions, which are triggered as new writes arrive. All nodes perform compactions independently, which gives each of them a chance to verify data integrity by checking checksums as they read the data. Because YugabyteDB replicates at a higher layer (the logical key-value changes in the Raft group), another tablet peer has the correct data (the probability of corruption occurring on two different physical writes is very low) and the corrupt copy can be discarded.

What is the best way to control the total number of shards in my YugabyteDB cluster?

You can control the number of shards—called tablets in YugabyteDB—using multiple methods, depending on the level of control you want. The ysql_num_tablets flag controls the number of tablets per YSQL table. Its default value is -1; when it is not set, the value falls back to ysql_num_shards_per_tserver. The difference between the two is that the former controls the total number of tablets for the table, whereas the latter controls the number of tablets per tserver. For example:

ysql_num_tablets=1

will store the table into a single tablet, whereas if the cluster consists of 3 nodes, where each has a tserver, and:

ysql_num_shards_per_tserver=1

then the table will be split into 3 tablets since there are 3 tservers. So, in this case, the number of tservers determines the tablet count.
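As a sketch of where these flags are set, here is a manually started yb-tserver with the per-tserver flag from the example above (the master address and data directory are placeholder values):

```shell
# Placeholder address and path; with this flag, every YSQL table created
# on the cluster defaults to one tablet per tserver (3 tablets on 3 nodes).
./bin/yb-tserver \
  --tserver_master_addrs=127.0.0.1:7100 \
  --fs_data_dirs=/mnt/data \
  --ysql_num_shards_per_tserver=1
```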

If you use the SPLIT INTO clause in your CREATE TABLE DDL statement, it overrides the values of the flags specified above. If required after the fact, you can always manually split any existing tablets as well. Additionally, YugabyteDB supports dynamic auto-splitting, which handles this for you as your data grows. This feature can be tracked via GitHub here.
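As a minimal sketch of the SPLIT INTO syntax, assuming a hypothetical orders table:

```sql
-- Pre-split the table into 10 tablets at creation time, overriding
-- both ysql_num_tablets and ysql_num_shards_per_tserver.
CREATE TABLE orders (
    order_id  BIGINT,
    customer  TEXT,
    PRIMARY KEY (order_id HASH)
) SPLIT INTO 10 TABLETS;
```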

Remember, since indexes are additional tables in YugabyteDB, this applies to them as well. For YCQL, the syntax is different and YCQL-specific; however, the same process applies. You can see this directly in the code called out here. Such is the beauty of open source.
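For comparison, a YCQL sketch of the same idea uses a table property rather than a DDL clause (the keyspace and table names are hypothetical; check the YCQL docs for your version):

```sql
-- YCQL: pre-split the table into 4 tablets via the tablets table property.
CREATE TABLE store.orders (
    order_id  BIGINT PRIMARY KEY,
    customer  TEXT
) WITH tablets = 4;
```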

Is there a way to enable low latency reads for my cluster without adding servers via read replicas?

There are multiple ways to extend your reads to regions local to your users with YugabyteDB. As the question calls out, read replicas are one of them; however, they require you to spin up additional infrastructure. Adding servers means paying more money, and many companies do not have that flexibility. In such cases, you can enable follower reads to allow low-latency reads from the primary cluster. As we call out in our docs:

By default, YugabyteDB requires reading from the leader to read the latest data. However, for applications that don’t require the latest data, or are working with unchanging data, the cost of contacting a potentially remote leader to fetch the data may be wasteful. Your application may benefit from better latency by reading from a replica that is closer to the client.

Although read replicas are typically used by monolithic applications to accomplish something similar, they require additional infrastructure and use asynchronous replication. Follower reads execute on the primary cluster. Since data in this cluster replicates synchronously, it allows you to read the right value as soon as the data is written to disk. However, please keep in mind that because data changes are still replicated from the leader, there is a chance of stale reads. Similar to the use of read replicas, this will not work for applications that require absolute correctness of data. Setup and examples of follower reads can be found here.
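A minimal YSQL sketch of enabling follower reads for a session follows; the staleness value and the table name are illustrative choices:

```sql
-- Allow reads to be served by followers; this only takes effect
-- in read-only transactions.
SET yb_read_from_followers = true;
SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY;

-- Accept data up to 30 seconds stale (illustrative value).
SET yb_follower_read_staleness_ms = 30000;

-- This read can now be served by the closest tablet peer.
SELECT * FROM orders WHERE order_id = 1001;
```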

New documentation, blogs, and tutorials

Outside of the Yugabyte blogs called out below, you can also check out our Yugabyte DEV Community Blogs here.

New videos

Upcoming events and training

Next steps

Ready to start exploring YugabyteDB features?

You have great options: run locally on your laptop (Quick Start), deploy to your favorite cloud provider (Multi-node Cluster Deployment), or sign up for a free Yugabyte Cloud cluster. It’s easy! Start today! 
