Distributed SQL Tips and Tricks – March 21, 2022
Welcome back to our distributed SQL tips and tricks blog! I have the pleasure of recapping distributed SQL questions from around the Internet.
How is data corruption handled in YugabyteDB compared to PostgreSQL?
Fundamentally, both PostgreSQL and YugabyteDB have their pros and cons because their storage engines are extremely different. With YugabyteDB you get the consistency of PostgreSQL, but with much stronger resiliency. YugabyteDB is resilient to all manner of issues, such as server panics, node failures, and disk corruption. Because the database runs across multiple servers, it tolerates the failure or corruption of a single disk as a unit. If needed, you can take down a bad server and bring up a new one, all while serving active traffic. A large automotive manufacturer was recently able to keep serving 2 million operations per second despite a server panic.
Regarding disk corruption specifically, the first major difference from PostgreSQL is that we use an LSM tree with SST files, which are written sequentially and are append-only. Random updates happen only in memory (the first level of the LSM tree is a MemTable), which is flushed to an SST file; SST files are then compacted into new SST files. This reduces the risk of corruption and guarantees that new writes cannot corrupt previously written data. In PostgreSQL, heap tables and B-trees are updated in place with random writes that can go to any block. If corruption occurs at the storage level, it can damage existing data, because old and new rows share the same block. Because of this, the whole database needs to be checked frequently. If a block was corrupted longer ago than the backup retention period, there is no way to get the data back.
Dealing with managed services
If you are using a managed service for your Postgres, such as RDS, there is a chance that the same corrupt block exists on the standby, because RDS replicates with storage-level sync rather than WAL shipping. With YugabyteDB, SST files are never altered once written, so new changes cannot corrupt past data. Additionally, YugabyteDB checks data validity as part of compactions, which are triggered by new writes to the database. All nodes perform compactions independently, which gives each of them a chance to verify data integrity by checking checksums as they read the data. Because YugabyteDB replicates at a higher layer (the logical key-value changes in the Raft group), another tablet peer has the correct data (the probability of corruption occurring on two different physical writes is very low), and the corrupt copy can be discarded.
What is the best way to control the total number of shards in my YugabyteDB cluster?
You can control the number of shards (called tablets in YugabyteDB) in several ways, depending on the level of control you want. The flag ysql_num_tablets controls the total number of tablets per YSQL table. Its default value is -1; if it is not set, the value of ysql_num_shards_per_tserver is used instead. The difference between the two is that the former controls the total number of tablets for the table, whereas the latter controls the number of tablets per tserver. For example, setting ysql_num_tablets=1 will store the table in a single tablet. If instead the cluster consists of 3 nodes, each running a tserver, and ysql_num_shards_per_tserver=1 is set, then the table will be split into 3 tablets, one per tserver. So, in this case, the number of tservers determines the tablet count.
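As an illustrative config fragment (based on the flag names above; the exact invocation depends on your deployment), these flags are passed to each yb-tserver at startup:

```
# yb-tserver startup flags (illustrative config fragment)

# One tablet per tserver for each new YSQL table:
--ysql_num_shards_per_tserver=1

# Or fix the total tablet count per YSQL table, regardless of cluster size:
--ysql_num_tablets=1
```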
If you use the SPLIT INTO syntax in the CREATE TABLE DDL statement, it overrides the flags described above. After the fact, you can always split any existing tablet manually as well. Additionally, YugabyteDB supports dynamic auto-splitting, which handles this for you as your data grows. This feature can be tracked via GitHub here.
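As a minimal sketch of the per-table override (SPLIT INTO is the documented YSQL syntax; the table and columns here are hypothetical):

```sql
-- Hash-shard this table into exactly 10 tablets at creation time,
-- overriding the ysql_num_tablets / ysql_num_shards_per_tserver flags.
CREATE TABLE orders (
    order_id BIGINT,
    details  JSONB,
    PRIMARY KEY (order_id HASH)
) SPLIT INTO 10 TABLETS;
```

Manual splitting of an existing tablet can then be done with yb-admin's split_tablet command, passing the tablet ID.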
Remember, since indexes are additional tables in YugabyteDB, all of this applies to them as well. For YCQL the syntax is different and YCQL-specific, but the same process applies. You can see this directly in the code, called out here. Such is the beauty of open source.
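For YCQL, the analogous per-table control is the tablets table property (the keyspace and table names here are hypothetical):

```sql
-- YCQL: request a fixed tablet count with the tablets property.
CREATE TABLE store.orders (
    order_id BIGINT PRIMARY KEY,
    details  TEXT
) WITH tablets = 10;
```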
Is there a way to enable low latency reads for my cluster without adding servers via read replicas?
There are multiple ways to extend your reads to regions local to your users with YugabyteDB. As the question calls out, read replicas are one of them, but they require you to spin up additional infrastructure. Adding servers means paying more money, and many companies do not have this flexibility. In such cases, you can enable follower reads to allow low-latency reads from the primary cluster. As we call out in our docs:
By default, YugabyteDB requires reading from the leader to read the latest data. However, for applications that don’t require the latest data, or are working with unchanging data, the cost of contacting a potentially remote leader to fetch the data may be wasteful. Your application may benefit from better latency by reading from a replica that is closer to the client.
Although read replicas are typically used by monolithic applications to accomplish something similar, they require additional infrastructure and use asynchronous replication. Follower reads execute on the primary cluster, where data is replicated synchronously, so you read a value that was committed to the cluster. However, keep in mind that because changes still propagate from the leader to its followers, there is a chance of slightly stale reads. As with read replicas, this will not work for applications that require absolutely current data. Setup and examples of follower reads can be found here.
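A minimal YSQL sketch of turning on follower reads for a session (the parameter names follow the follower-reads docs; the staleness bound and query are illustrative):

```sql
-- Follower reads apply only to read-only statements and transactions.
SET default_transaction_read_only = true;

-- Allow reads to be served by the nearest tablet peer, not just the leader.
SET yb_read_from_followers = true;

-- Optionally bound how stale a follower read may be (in milliseconds).
SET yb_follower_read_staleness_ms = 30000;

SELECT * FROM orders WHERE order_id = 42;
```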
New documentation, blogs, and tutorials
Outside of the Yugabyte blogs called out below, you can also check out our Yugabyte DEV Community Blogs here.
- Performing Flexible IO Testing in YugabyteDB
- Three Reasons Why I Joined Yugabyte APJ
- Distributed SQL Deep Dive: Inside YugabyteDB’s Two-Layer Architecture
- Announcing YugabyteDB 2.12: Seamless Security and Better Manageability
- Yugabyte Partners with HashiCorp Vault for Centralized Secrets Management
- Better Developer Experience: Getting Started with YugabyteDB on Gitpod
- Database Connection Management: Exploring Pools and Performance
- YugabyteDB Community Spotlight – February 2022
- Cloud Native Java: Integrating YugabyteDB with Spring Boot, Quarkus, and Micronaut
- TPC-C Benchmark: Scaling YugabyteDB to 100,000 Warehouses
- YugabyteDB and Apache Superset: Explore and Visualize Open Source Data
- Linux Performance Tuning: Dealing with Memory and Disk IO
- YSQL Exercises: Learning Path, 7 self-paced mini-courses
- Hands-On Cloud Native GraphQL: Building a Scalable and Resilient API Layer
- YugabyteDB Friday Tech Talks: Episode 10: YSQL Collation Support
- YugabyteDB Friday Tech Talks: Episode 9: yb_hash_code() built-in function in YSQL
- From Postgres and MongoDB to the Yuga of Cloud Native Databases
- YugabyteDB Friday Tech Talks: Episode 8: Generalized Inverted Indexes in YugabyteDB
- Do My Microservices Need a Heterogeneous Data Layer?
- YugabyteDB Friday Tech Talks: Episode 7: YSQL Follower Reads
- YugabyteDB Friday Tech Talks: Episode 6: Multi-Region Deployment Options
- Distributed SQL Summit 2021 – all videos!
Upcoming events and training
- YugabyteDB Friday Tech Talks: Business Continuity – Backup/Restore in YugabyteDB
- Nordic PGDay 2022
- Yugabyte Cloud Basics: Explore Effortless Distributed SQL at Limitless Scale
- Distributed SQL Summit Asia
- Florida Topgolf Event
- Google Data Cloud Summit 2022
- Postgres Conference Silicon Valley 2022
- CIO Visions Summit, Orlando
- AWS Summit San Francisco
- CIO Visions Summit, North America
- KubeCon + CloudNativeCon Europe 2022
- Cloud Expo Asia, Singapore
- KubeCon + CloudNativeCon North America 2022
- CIO Visions Summit, Chicago
- Yugabyte University Calendar: Free Virtual Training and Builder Workshops
Ready to start exploring YugabyteDB features?
You have great options: run locally on your laptop (Quick Start), deploy to your favorite cloud provider (Multi-node Cluster Deployment), or sign up for a free Yugabyte Cloud cluster. It’s easy! Start today!