Distributed SQL Tips and Tricks – November 2nd, 2021

Marko Rajcevic

Welcome back to our tips and tricks blog series! With October officially a wrap, I have the pleasure of recapping some distributed SQL questions from around the Internet. There was no Tips and Tricks edition for September as the Yugabyte team was focused on our 2021 Distributed SQL Summit. If you did not have a chance to check it out, you can view the sessions on-demand.

As always, this blog series would not be possible without all of the behind-the-scenes and community work done by YugabyteDB team members such as Dorian Hoxha, Franck Pachot, and Frits Hoogland. All of us here at Yugabyte also have to thank our incredible user community for their work with the technology, and for not being afraid to ask questions.

Do you have questions yourself? Make sure to ask them on our YugabyteDB Slack channel, Forum, GitHub, or Stack Overflow. For previous Tips & Tricks posts, check out our archives. Now let’s dive in:

What is the right approach to scaling horizontally with YugabyteDB?

In traditional database systems, scaling is often accomplished by increasing the size of your server to add CPU or RAM (i.e., scaling vertically). Because hardware comes in fixed CPU/RAM sizes, this often leads to undersized or oversized clusters, and the jumps can be expensive (e.g., scaling from 32 cores to 64 cores).

YugabyteDB runs across a cluster of nodes and is built to let you add nodes easily rather than forcing you to rely on vertical scaling alone. This gives you the ability to take advantage of both styles of scaling: vertical and horizontal. As mentioned in our deployment checklist, we recommend running production workloads on a minimum of 16 cores per node. That said, we understand this may not be feasible for everyone. Since many users run YugabyteDB with 4 or 8 cores per node, it is important to discuss the best approach to scaling these clusters as you grow.

What we have found is that the most effective time to scale out is once you have scaled up to a minimum of 8 cores per node. At that point it makes sense to scale horizontally to about 6 nodes, after which we recommend scaling each node up to the recommended production size of 16 cores. From there you can continue to scale out as needed. If you are looking to scale up your entire cluster, or complete an upgrade, take a look at our change configuration docs, which walk you through the process on a 6-node RF=3 cluster.

A key point to consider, whether you are adding nodes to your cluster or doing a full cluster move, is that it is better to present all of the new nodes to the cluster at once rather than adding one node at a time and waiting for data to rebalance. This minimizes the number of times data has to be rebalanced across the cluster. That said, in circumstances where you can do an in-place instance change, you can take the approach of doing it one node at a time.

With an RF=3 cluster, you can perform a rolling upgrade, taking down one node at a time, without any downtime or data reshuffling. Be aware that your tablets will be temporarily under-replicated while a node is down. For most use cases this is fine, since the point of the operation is to bring the node back up quickly. These processes are simpler for Yugabyte Platform and Yugabyte Cloud users, where the steps are automated through the console.

Should we use xCluster asynchronous replication or row-level geo-partitioning for our multi-region cluster?

In YugabyteDB there are two main types of replication: synchronous and asynchronous. By default, synchronous replication is used within a cluster to replicate data across the replicas of each tablet's Raft group. This type of replication, paired with the ability to partition a table row by row, is what enables a row-level geo-partitioning configuration. This configuration is most often used in scenarios where user data is required to stay within a particular region.

Let’s say you are a company that has been running its application in the United States and is ready to expand into Europe. In the EU, however, you have to abide by GDPR rules that require EU user data to stay within the EU. In this case, you can partition your USERS table by a column named geo_location (as shown in this example) to keep each user’s data in the correct geography.
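
To make this concrete, here is a minimal YSQL sketch of what that setup can look like. The cloud, region, table, and column names below are placeholders, and the linked example walks through the full configuration:

-- Tablespaces that pin replicas to a specific region (cloud/region/zone values are placeholders).
CREATE TABLESPACE us_tablespace WITH (replica_placement=
  '{"num_replicas": 3, "placement_blocks": [{"cloud":"aws","region":"us-east-1","zone":"us-east-1a","min_num_replicas":3}]}');
CREATE TABLESPACE eu_tablespace WITH (replica_placement=
  '{"num_replicas": 3, "placement_blocks": [{"cloud":"aws","region":"eu-west-1","zone":"eu-west-1a","min_num_replicas":3}]}');

-- Partition the users table by geo_location so each row is stored in its own region.
CREATE TABLE users (
  user_id      BIGINT NOT NULL,
  geo_location TEXT NOT NULL,
  email        TEXT,
  PRIMARY KEY (user_id, geo_location)
) PARTITION BY LIST (geo_location);

CREATE TABLE users_us PARTITION OF users FOR VALUES IN ('US') TABLESPACE us_tablespace;
CREATE TABLE users_eu PARTITION OF users FOR VALUES IN ('EU') TABLESPACE eu_tablespace;

With this in place, rows inserted with geo_location = 'EU' land on the EU tablespace and never leave that region.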

Using the same scenario, let’s say that instead you want a completely separate cluster in that region, and you want the two clusters to replicate data on a table-by-table basis. In this case, xCluster asynchronous replication makes more sense. On a table-by-table basis, you can configure unidirectional or bidirectional replication between the two clusters. For companies that need to share data across regions, xCluster asynchronous replication lets each region serve data locally, which significantly lowers latency.

How do I keep my tablet (shard) count low?

As a distributed SQL database, YugabyteDB automatically shards your data behind the scenes into what we refer to as tablets. Through this process we take your tables, regardless of size, and split them into a number of shards, which are then distributed across the nodes in the cluster. This evens out the load across all of the nodes rather than having a single node serve every read and write. In YugabyteDB, every node, including any node you add later, can serve reads and writes. As you continue to increase your table count, you also increase your shard count.

As this count grows into the thousands, you can start running into performance issues. There are a couple of strategies for avoiding this, both of which involve lowering the overall tablet count. The first is to decrease the number of shards created for each table or secondary index. This number is dictated by the --yb_num_shards_per_tserver flag for YCQL and the --ysql_num_shards_per_tserver flag for YSQL. For example, if you set these values to 1, each table is created with a single tablet per yb-tserver.
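
These flags apply cluster-wide. If you only want to reduce the tablet count for specific tables, YSQL also lets you set the tablet count per table at creation time. A minimal sketch, assuming a hypothetical lookup table:

-- Hypothetical small lookup table created with a single tablet,
-- overriding the cluster-wide default for this table only.
CREATE TABLE country_codes (
  code TEXT PRIMARY KEY,
  name TEXT
) SPLIT INTO 1 TABLETS;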

The second option is to use colocated tables, where multiple tables reside in the same shard. This option is best for use cases with a large number of small tables that do not benefit from being sharded across the cluster. For example, let’s say you have a total of 20 tables, 15 of which are under 1 GB, while the rest range from 50 to 300 GB. In this case, you can group those 15 small tables into a single shard (see the sketch below) while still breaking out the larger tables based on the flags called out above. In both scenarios, your shards are replicated according to the replication factor (RF) you set, ensuring continuous availability. Additional information on colocated tables can be found in this in-depth blog on the topic.
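
As a rough sketch, colocation is enabled at the database level, and individual large tables can opt out. The database and table names below are hypothetical, and the exact keyword (colocated vs. colocation) varies by YugabyteDB version, so check the docs for your release:

-- Database whose tables share a single colocation tablet by default.
CREATE DATABASE appdb WITH colocated = true;

-- Small tables simply inherit colocation.
CREATE TABLE lookup_values (k TEXT PRIMARY KEY, v TEXT);

-- A large table opts out so it is still sharded across the cluster.
CREATE TABLE events (
  event_id BIGINT,
  payload  JSONB,
  PRIMARY KEY (event_id HASH)
) WITH (colocated = false);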

How can we make our data more secure with YugabyteDB?

YugabyteDB supports numerous ways to make sure your data, as well as access to it, remains secure. The first piece that comes to mind is the connection to your database: how secure do you want it to be? If security is top of mind for your organization, you will almost certainly want to enable client-to-server encryption for your connections. Keep in mind that, as a distributed SQL system, the nodes in your cluster also have to communicate with each other, so we let you enable server-to-server encryption as well. Both methods use TLS. Beyond network encryption, we also support encryption at rest with user-generated keys, as well as column-level encryption. Column-level encryption is useful when you need to further secure sensitive columns such as an address or credit card number.
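
One common way to do column-level encryption is with the pgcrypto extension, which YSQL supports thanks to its PostgreSQL compatibility. The table, column, and key below are placeholders, and in practice the key should come from a secrets manager rather than being inlined in SQL:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE payment_methods (
  user_id     BIGINT PRIMARY KEY,
  card_number BYTEA   -- stored encrypted, never as plain text
);

-- Encrypt on write...
INSERT INTO payment_methods (user_id, card_number)
VALUES (1, pgp_sym_encrypt('4111-1111-1111-1111', 'replace-with-a-managed-key'));

-- ...and decrypt only when an authorized session needs the value.
SELECT user_id, pgp_sym_decrypt(card_number, 'replace-with-a-managed-key') AS card_number
FROM payment_methods;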

Additionally, there are numerous authentication and audit logging methods you can incorporate into your strategy. Authentication ensures that only users with valid credentials can access your system, while audit logging provides visibility into the actions users take within the database. For use cases that require multi-tenancy, there is a further question to consider: how do you make sure users only access their own data? This is where Row-Level Security (RLS) comes into play. Using RLS, you can make sure that users querying a multi-tenant table can only view their own rows. Depending on the level of Discretionary Access Control (DAC) that your organization or application requires, you can also implement Role-Based Access Control (RBAC).
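
As an illustration, here is a minimal RLS sketch using standard PostgreSQL syntax, which YSQL supports. It assumes the simplifying convention that each tenant connects as a database role whose name matches the tenant_name column:

-- Hypothetical multi-tenant table.
CREATE TABLE tenant_data (
  tenant_name TEXT NOT NULL,
  doc         JSONB
);

ALTER TABLE tenant_data ENABLE ROW LEVEL SECURITY;

-- Each session sees only the rows whose tenant_name matches its login role.
CREATE POLICY tenant_isolation ON tenant_data
  USING (tenant_name = current_user);

Note that table owners and superusers bypass RLS by default, so application users should connect as non-owner roles.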

Using a combination of the encryption, authentication, and logging methods above, you can protect your data from numerous attack vectors. Have a security concern, or see a missing feature? Please raise an issue on our GitHub.


Next Steps

Ready to start exploring YugabyteDB features? Getting up and running locally on your laptop is fast. Visit our Quick Start page to get started.

