5 (and a Half) Scenarios and the Multi-Tenancy Options That Support Them

March 5, 2024

Companies developing multi-tenant applications have to weigh the benefits and challenges (pros and cons, if you will) of separating tenants (i.e. their app’s end users) across their database. Multi-tenancy is a common strategy among Independent Software Vendors (ISVs) offering SaaS services, enterprises providing database-as-a-service (DBaaS) to different business units, or firms creating sandboxes for development and testing. The key for multi-tenancy implementations is determining the degree of isolation needed and how this isolation meets specific requirements. Not taking these considerations into account can lead to problems such as the “noisy neighbor,” cross-tenant data contamination (where one tenant can see or change another tenant’s data), or regulatory compliance violations when data privacy is non-negotiable. The concept of “good fences make good neighbors” aptly highlights the importance of proper isolation in these scenarios.

The “noisy neighbor” problem in multi-tenant environments happens when one tenant’s excessive use of shared resources degrades the performance of others. This can lead to slower response times, service outages, or reduced system stability.

What is multi-tenancy?

Multi-tenancy (or a multi-tenant architecture) allows a single instance of an app and its infrastructure to serve multiple customers. Users share both the application and the database, but their data remains isolated and invisible to others. “Tenants” are the various users of an application, which could be individuals, companies, or organizations, depending on the application vendor’s definition. For instance, Company A might identify a “tenant” by a user_id, while Company B uses an org_id encompassing several user_ids. Both approaches align with our understanding of a “tenant” in this context.

Multi-Tenancy Options in YugabyteDB

Let’s examine the different multi-tenancy options in YugabyteDB, focusing on how it manages tenant isolation and the shared database components.

The table below shows how shared and isolated components build upon each other. For example, choosing the “Separate Tenants by Row” strategy (4th row below) also involves sharing the schema (row #3) and the named database (row #2).

Separate Tenants By:	Shared Components	Isolated Components	Tenant Limits*
Cluster	None	All resources are isolated, including the user and cluster management (yb-master and yb-tserver)	Technically unlimited (primarily limited by cost)
Named Database	Infrastructure, user credentials, roles	PG catalog, schema, tables (through colocation), indexes, tablespaces, parameters, users, backups	<100 tenants per cluster
Schema (single database, separate schema)	Database, PG catalog, connection	Tables (through colocation), indexes, tablespaces, schema management and customization (and therefore app versioning)	<100 tenants per cluster
Row (single database, shared schema)**	Tables, indexes	Access privileges (through row-level security)***	Unlimited (can scale clusters as needed)
Table partition ****	Tables, indexes	Access privileges (through row-level security)***, data placement (through tablespaces and partitions)	<1000 tenants per cluster

Please Note:
* The Tenant Limits column provides our recommended tenant numbers for each scenario, serving as general suggestions rather than hard limits. Your actual limit may vary based on factors like cluster size, shard count, connection quantity, and general query patterns.
** The YCQL API supports the Separate Tenants by Row strategy only, because it lacks support for colocation and table partitioning. Scaling multi-tenant solutions using YCQL with a Separate Tenants by Database or Schema strategy can therefore lead to a significant increase in tablets (shards), which may not be cost-effective if there are a high number of tenants. For most multi-tenancy scenarios, we recommend using the YSQL API, unless a Separate Tenants by Row strategy fits your needs or you have a small number of tenants per cluster.
*** The YCQL API does not support row-level security. Therefore, you cannot control user access privileges if using the Separate Tenants by Row strategy.
**** The Separate Tenants by Table Partition strategy is an extension of the Separate Tenants by Row strategy.

YugabyteDB Multi-Tenancy Options — Scenarios for Use

Now that we have looked at YugabyteDB’s multi-tenancy options (and how they handle tenant isolation and share database components), let’s map those options to scenarios that companies might face.

Today many companies are developing fully-managed SaaS solutions as replacements for their traditional self-managed ones. This transition has brought multi-tenancy to the forefront, prompting a discussion on when to use what. Below are the five (and a half) scenarios that we have explored and the recommendations we have for each one.

Run different application versions for each tenant and isolate each tenant’s data
Consider separating tenants by database or schema
To maximize the separation between tenants, both in terms of versioning and data, separate them by database or schema (whichever meets your requirements better). This can provide deeper levels of data separation and allow for customized schemas per tenant. This approach does have its challenges, such as having to manage more databases or schemas and the extra effort needed to create new database objects for each tenant. Separating tenants by database adds additional complexity, requiring more work on the application layer to ensure tenant connections are correctly mapped and to aggregate data for analysis. However, this method enhances security by preventing shared database connections. Additionally, separating tenants by database simplifies disaster recovery processes and makes it easier to migrate larger tenants to their own clusters.
In either case, enabling colocation is advisable to minimize the number of tablets, preventing increased resource usage as more tenants are onboarded. You can keep smaller tables colocated while splitting up the larger tables that require scalability. However, separating tenants by schema will colocate any tenant data that is not split onto a single tablet, lessening data separation. For the maximum amount of data separation, consider separating tenants by database.
Separating tenants by database makes sense in the following five scenarios.
1. Maximum security restrictions between tenants (connections, data, privileges) are needed
2. Schema is customizable per tenant
3. The backup and restore process for specific tenants is less complicated
4. There are no requirements to combine data across tenants (unless on a downstream system)
5. You are willing to take on the operational burden (potential app layer work and additional maintenance activities) to achieve these benefits.
Separating tenants by schema makes sense in the following four scenarios.
1. A high degree of data isolation between tenants (data, privileges) is required
2. Schema is customizable per tenant
3. You need to combine data across tenants
4. You are willing to take on additional maintenance activities across tenants
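As a rough sketch of what onboarding a tenant could look like under each option, the Python helper below only builds YSQL statements as strings; it does not connect to a cluster. The tenant name, the `orders` table, and the function names are hypothetical. The colocation syntax (`WITH COLOCATION = true` on the database, `WITH (COLOCATION = false)` to opt a large table out) is an assumption based on recent YugabyteDB releases and should be checked against your version.

```python
import re

# Only simple lowercase identifiers are accepted, because the names are
# inlined directly into DDL text.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _check_ident(name: str) -> str:
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def database_per_tenant(tenant: str) -> list[str]:
    """Statements for the 'separate by database' option (hypothetical layout)."""
    db = _check_ident(f"tenant_{tenant}")
    return [
        # Assumed syntax: a colocated database keeps small tables together.
        f"CREATE DATABASE {db} WITH COLOCATION = true;",
        # To be run after connecting to the new database: a large table
        # opts out of colocation so it can be split across tablets.
        "CREATE TABLE orders (id BIGINT PRIMARY KEY, total NUMERIC) "
        "WITH (COLOCATION = false);",
    ]


def schema_per_tenant(tenant: str) -> list[str]:
    """Statements for the 'separate by schema' option (hypothetical layout)."""
    schema = _check_ident(f"tenant_{tenant}")
    return [
        f"CREATE SCHEMA {schema};",
        f"CREATE TABLE {schema}.orders (id BIGINT PRIMARY KEY, total NUMERIC);",
    ]


if __name__ == "__main__":
    for stmt in database_per_tenant("acme") + schema_per_tenant("acme"):
        print(stmt)
```

Either way, every new tenant means new database objects to create and maintain, which is the operational burden described above.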
Manage application releases simultaneously for all tenants, but control the data placement on a per-tenant basis.
Consider separating tenants by table partition
To manage applications across all tenants while controlling the data placement, consider separating tenants using table partitions. This method, based on the PostgreSQL tablespaces concept, lets you specify which nodes host each tenant’s data, offering both management ease and a degree of data isolation between tenants. However, this doesn’t guarantee physical data separation, especially if you want to have multiple tenants per node to be more cost-effective. In addition, from a security perspective, any authenticated user who can access the database has full access to all the data in the table. So implementing row-level security (RLS) is advisable to better secure user data. Finally, you do have the option to isolate certain tenants, with the remainder having their data grouped across tablets.
Separating tenants by table partition makes sense in the following five scenarios.
1. There are a large number of tenants, but certain ones require stricter data separation.
2. Flexibility is needed to manage application releases simultaneously across tenants.
3. There are no requirements for separate application versions across tenants.
4. You need to control data placement for each tenant.
5. There are no strict security requirements (although this last one does have a workaround – to a degree – with RLS).
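To make the partition-plus-tablespace idea more concrete, here is a minimal sketch that builds the relevant YSQL statements as strings. The table, tenant, tablespace, cloud, region, and zone names are all hypothetical, and the `replica_placement` option layout is an assumption about YugabyteDB's tablespace syntax that should be verified for your release.

```python
import json

# Hypothetical shared table, list-partitioned on the tenant identifier.
PARENT_TABLE_SQL = (
    "CREATE TABLE orders (tenant_id TEXT NOT NULL, id BIGINT NOT NULL, "
    "total NUMERIC, PRIMARY KEY (tenant_id, id)) PARTITION BY LIST (tenant_id);"
)


def tablespace_sql(name: str, cloud: str, region: str, zones: list[str]) -> str:
    """Build a CREATE TABLESPACE statement placing one replica in each zone.

    Names are inlined as-is, so they are assumed to come from trusted config.
    """
    placement = {
        "num_replicas": len(zones),
        "placement_blocks": [
            {"cloud": cloud, "region": region, "zone": z, "min_num_replicas": 1}
            for z in zones
        ],
    }
    return (
        f"CREATE TABLESPACE {name} WITH "
        f"(replica_placement='{json.dumps(placement)}');"
    )


def partition_sql(parent: str, tenant_id: str, tablespace: str) -> str:
    """Give one tenant its own partition, pinned to a tablespace."""
    return (
        f"CREATE TABLE {parent}_{tenant_id} PARTITION OF {parent} "
        f"FOR VALUES IN ('{tenant_id}') TABLESPACE {tablespace};"
    )


def default_partition_sql(parent: str) -> str:
    """All remaining tenants share a default partition."""
    return f"CREATE TABLE {parent}_default PARTITION OF {parent} DEFAULT;"


if __name__ == "__main__":
    zones = ["us-west-1a", "us-west-1b", "us-west-1c"]
    print(tablespace_sql("acme_ts", "aws", "us-west-1", zones))
    print(PARENT_TABLE_SQL)
    print(partition_sql("orders", "acme", "acme_ts"))
    print(default_partition_sql("orders"))
```

Only the tenants that need controlled placement get a dedicated partition; everyone else lands in the default partition, which matches the option of isolating certain tenants while grouping the remainder.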
The primary goal is the lowest possible management overhead (with or without row-level security). Data isolation is not a strict requirement.
Consider separating tenants by row
If the main priority is to lower overhead as much as possible when managing and onboarding new tenants, then separating tenants by row is the best choice. Managing all tenants in a single table simplifies the introduction of a new tenant. Just insert a row into the table, eliminating additional processes around database and schema provisioning when onboarding new tenants. It also enables the management of a single application version across all of your tenants, simplifying management even further. Depending on security needs, this can be implemented with or without row-level security (RLS).
Separating tenants by row (with or without row-level security) makes sense when you:
- Want the lowest management overhead when onboarding new tenants
- Want simplified access controls
- Run a single application version for all tenants (with no per-tenant application customizations)
- Do not need data separation between tenants
- Are willing to navigate complex backup and recovery per tenant
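The shared-table pattern itself is simple enough to sketch in a few lines. The example below uses Python's built-in SQLite module purely as a stand-in so it runs anywhere; the table and column names are hypothetical, and in YugabyteDB the same idea would be expressed in YSQL. Note that without row-level security, isolation depends entirely on every query carrying the tenant filter.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create the shared tables that every tenant lives in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tenants (tenant_id TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " tenant_id TEXT NOT NULL REFERENCES tenants(tenant_id),"
        " order_id INTEGER NOT NULL,"
        " total REAL NOT NULL,"
        " PRIMARY KEY (tenant_id, order_id))"
    )
    return conn


def onboard_tenant(conn: sqlite3.Connection, tenant_id: str, name: str) -> None:
    """Onboarding is a single INSERT; no new database or schema is provisioned."""
    conn.execute(
        "INSERT INTO tenants (tenant_id, name) VALUES (?, ?)", (tenant_id, name)
    )


def add_order(conn: sqlite3.Connection, tenant_id: str, order_id: int, total: float) -> None:
    conn.execute(
        "INSERT INTO orders (tenant_id, order_id, total) VALUES (?, ?, ?)",
        (tenant_id, order_id, total),
    )


def orders_for(conn: sqlite3.Connection, tenant_id: str) -> list[tuple]:
    """Every read must be scoped by tenant_id in application code."""
    return conn.execute(
        "SELECT order_id, total FROM orders WHERE tenant_id = ? ORDER BY order_id",
        (tenant_id,),
    ).fetchall()


if __name__ == "__main__":
    db = make_db()
    onboard_tenant(db, "acme", "Acme Corp")
    onboard_tenant(db, "globex", "Globex")
    add_order(db, "acme", 1, 10.0)
    add_order(db, "globex", 1, 99.0)
    print(orders_for(db, "acme"))
```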
Use the same application version across tenants and ensure simplified onboarding and management (with row-level security).
Consider separating tenants by row with row-level security
If you’re looking to simplify how users connect to the database but you want to control their level of access, you can separate tenants using row-level security. You define what data users can (or cannot) access within a table, typically ensuring tenants only see their own data. This method adds a security layer, preventing accidental cross-tenant data visibility due to any application logic errors. With all users in one table, there is a single schema to manage, so you can run a single version of the app across all tenants. However, since human error can impact row-level security (RLS) implementation, there’s no guarantee tenants will only access their data if the RLS policy is inaccurately set.*
Separating tenants by row (using row-level security) makes sense in the following four scenarios.
- You want simplified access controls.
- You want to manage a single schema or run a single app version for all tenants (and no tenants require app customizations).
- There is no requirement to separate data for tenants.
- You are willing to navigate the complex backup and recovery process per tenant.
* Separating your tenants by row leaves you susceptible to application logic gaps that may result in a tenant seeing another tenant’s data. To protect yourself from this, we highly encourage you to set up row-level security to control what users can see.
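A minimal sketch of such a policy, assuming PostgreSQL-style row-level security as exposed through YSQL: the helper below only builds the statements as strings. The `orders` table, the `tenant_id` text column, the policy name, and the `app.tenant_id` session setting are all hypothetical names chosen for illustration.

```python
def rls_statements(
    table: str,
    tenant_column: str = "tenant_id",
    setting: str = "app.tenant_id",
) -> list[str]:
    """Build the DDL that restricts each session to its own tenant's rows.

    The tenant column is assumed to be TEXT, because current_setting()
    returns text. Names are inlined as-is and assumed to be trusted.
    """
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({tenant_column} = current_setting('{setting}'));",
    ]


def session_setup(tenant_id: str, setting: str = "app.tenant_id") -> str:
    """Statement the application would run after connecting for a tenant."""
    escaped = tenant_id.replace("'", "''")  # double single quotes for SQL
    return f"SET {setting} = '{escaped}';"


if __name__ == "__main__":
    for stmt in rls_statements("orders"):
        print(stmt)
    print(session_setup("acme"))
```

In PostgreSQL, table owners and superusers bypass policies by default, so a sketch like this only helps if the application connects as a separate, less-privileged role. That is one example of the human-error risk mentioned above.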
Most tenants don’t require strict isolation or much customization; however, some tenants will want stricter isolation
Consider a combination of “Separate Tenants by Row (With or Without Row-Level Security)” and “Separate by Database”
For a tiered tenant approach (meaning there are different requirements across tenants), combining these two strategies is effective. Typically, most tenants might not need data isolation or customization, while a select few (like larger companies or those with specific compliance needs) may require stricter data isolation or customizations. For the majority, separating tenants by row works well, whereas special cases benefit from database or schema separation. This strategy leverages the advantages of each method according to tenant needs. However, each option has its challenges, which you can read about in the scenarios highlighted above. For example, if you provide a specific tenant with their own database, combining their data with that of other tenants for analysis will require additional work.
A combination of Separate Tenants by Row (with or without RLS) and Separate by Schema/Database makes sense to:
- Lower management overhead for the majority of tenants.
- Allow special tenants to receive stricter data isolation and/or customizations.
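On the application side, a tiered setup usually needs some mapping from tenant to storage location. The sketch below is one hypothetical way to express that in Python; the database names, the `DEDICATED` mapping, and the `TenantRoute` type are illustrative assumptions, not part of YugabyteDB.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

SHARED_DATABASE = "app_shared"  # hypothetical shared, row-separated database

# Hypothetical list of tenants that were given their own database.
DEDICATED: Mapping[str, str] = {"bigcorp": "tenant_bigcorp"}


@dataclass(frozen=True)
class TenantRoute:
    database: str                 # database the connection should target
    tenant_filter: Optional[str]  # tenant_id to filter on, if tables are shared


def route_for(tenant_id: str, dedicated: Mapping[str, str] = DEDICATED) -> TenantRoute:
    """Send special tenants to their own database and everyone else to the
    shared database, where rows are separated by tenant_id."""
    if tenant_id in dedicated:
        return TenantRoute(database=dedicated[tenant_id], tenant_filter=None)
    return TenantRoute(database=SHARED_DATABASE, tenant_filter=tenant_id)


if __name__ == "__main__":
    print(route_for("bigcorp"))
    print(route_for("small_shop"))
```

Keeping this mapping in one place also makes it easier to move a growing tenant from the shared database to a dedicated one later.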

Extra 1/2 Scenario: Separating Tenants by Cluster

There is one more scenario — Separating Tenants by Cluster — that I did not cover in this blog because it’s not truly multi-tenant for most users; however, this is certainly something you can do. For example, Tier 1 tenants may have high data volumes and require their own cluster. In this case, these tenants can have their own clusters while smaller tenants reside on a single cluster.

In Conclusion…

Many of these strategies can be combined to address different needs, just as they were for the fifth scenario. For example, larger tenants might need to be isolated into their own environments, while smaller tenants can be combined into a shared environment with finer-grained isolation guardrails.
