Measuring the Performance Impact of TLS Encryption Using TPC-C
Organizations need to protect their data, especially the personal data entrusted to them by their users and customers. To do so, the data a database transfers over the network needs to be secure. This is often accomplished using TLS, an encryption protocol that secures communication across a network. When a connection between a client and a server is secured by TLS, the two parties can identify one another (preventing impersonation), the communication remains private (its contents cannot be sniffed by anybody else), and the integrity of the data being exchanged is preserved (nobody can alter it while in transit).
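To make those three guarantees a bit more concrete, here is a minimal client-side sketch using Python's standard `ssl` module. It is not specific to YugabyteDB or to any database driver; the helper name `make_client_tls_context` and the `ca_file` parameter are made up for illustration.

```python
import ssl


def make_client_tls_context(ca_file=None):
    """Build a client-side TLS context that verifies the server.

    ca_file is a hypothetical path to the CA certificate that signed the
    server certificate. When it is None, the system trust store is used.
    """
    # create_default_context turns on certificate verification and
    # hostname checking, which is what lets the client identify the server.
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

Privacy and integrity come from the negotiated cipher suite once a socket is wrapped with this context, so there is nothing extra to configure for those two properties in this sketch.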
In the case of a distributed SQL database like YugabyteDB, there are two main paths to secure:
- Client to server communication where applications connect to the database servers
- Server to server communication where the database servers talk to one another to perform distributed query processing and transparent data replication
When TLS encryption is enabled, one of the major concerns application developers have is how much latency it will add on each of those two paths. In this blog post we run TPC-C benchmarks against a YugabyteDB cluster with and without TLS encryption enabled, to understand how enabling TLS between the application and the YugabyteDB cluster impacts performance.
Enabling TLS does not have a large impact on overall performance, as reported by the TPC-C benchmarks. Across the warehouse counts we tested, the efficiency and CPU usage of a YugabyteDB cluster with TLS enabled and one without were within 5% of each other. Here are the results with TLS enabled, relative to the results without it:
- Throughput drops by < 5%, which is marginal.
- Average latency increases by 15%. However, note that there can be up to a 10% variance between any two runs in the cloud.
- CPU increases by roughly 6% when enabling TLS, again within the margin of error in the cloud.
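As a rough illustration of how to read those numbers against run-to-run noise, the sketch below flags only the changes that exceed the 10% variance mentioned above. The function name and the dictionary are illustrative, and the throughput entry uses 5% as the upper bound of the "< 5%" figure.

```python
def exceeds_noise(change_pct, noise_pct=10.0):
    """Return True when a change is larger than the assumed run-to-run variance."""
    return abs(change_pct) > noise_pct


# Summary figures from the list above, as percent change with TLS enabled.
summary = {
    "throughput": -5.0,       # upper bound of "< 5%"
    "average latency": 15.0,
    "CPU": 6.0,
}

flagged = [name for name, change in summary.items() if exceeds_noise(change)]
print(flagged)
```

Under that assumption, only the average latency change stands out from the noise.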
What this means to you as a user is that when you decide to secure your communication between nodes and the application, there is no significant impact to performance unless you have a very high connection count where latency can increase a bit. With TLS encryption enabled, you can still have lightning fast responses between nodes and applications but also have confidence that your communication is secure.
Deploy YugabyteDB in a single AWS region:
- We will create a 3 node cluster
- Each node is a c5d.4xlarge instance type (16 vCPUs, 32 GiB RAM)
- Storage is 400GB SSD on gp2 EBS
- Run the TPC-C benchmark in two different scenarios against this single-region setup
- Scenario 1: Run the TPC-C benchmark with TLS enabled using 1500 warehouses
- Scenario 2: Run the TPC-C benchmark with TLS disabled using 1500 warehouses
./tpccbenchmark --create=true --load=true --execute=true --nodes=$NODES --warehouses=1500 --loaderthreads 48
- Analyze the performance of each scenario
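If you want to script the two runs, the benchmark command above can be assembled programmatically. The sketch below is a hedged example: the helper `build_tpcc_command` and the placeholder IP addresses are made up, and it assumes `$NODES` is a comma-separated list of node addresses. It only uses the flags that appear in the command above; that command shows no TLS-specific flag, so none is included here.

```python
import shlex


def build_tpcc_command(nodes, warehouses=1500, loader_threads=48):
    """Assemble the tpccbenchmark invocation as an argument list."""
    # Assumption: --nodes takes a comma-separated list of addresses.
    node_arg = ",".join(nodes)
    return [
        "./tpccbenchmark",
        "--create=true",
        "--load=true",
        "--execute=true",
        f"--nodes={node_arg}",
        f"--warehouses={warehouses}",
        "--loaderthreads", str(loader_threads),
    ]


# Placeholder addresses, not the ones used in this benchmark.
cmd = build_tpcc_command(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the directory that contains `tpccbenchmark`.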
The following was installed on the three nodes:
- The instances were created with the centos-7 AMI
- YugabyteDB v2.5.1 was installed on the nodes to create a cluster
- A separate benchmark node of type c5.4xlarge was used to run TPC-C
The YugabyteDB clusters that are being used are 3 node clusters with RF 3 in AWS using c5d.4xlarge instance types.
Results: 1500 Warehouses
What you’ll see below is that the performance between the TLS enabled and TLS disabled clusters is very close.
|Metric|TLS Disabled|TLS Enabled|% Drop in Perf|
|---|---|---|---|
|Throughput|673.4 requests/sec|649.3 requests/sec|-3.6%|
|Avg Latency|1069.8 ms|1281.9 ms|-20%|
|P99 Latency|18053.0 ms|24133.3 ms|-33%|
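The percentages in the last column can be reproduced from the raw numbers with a few lines of Python. This is just a sanity-check sketch, and the names in it are illustrative. Note the sign convention: the code reports latency growth as a positive change, whereas the table reports it as a negative number, meaning a drop in performance.

```python
def percent_change(baseline, measured):
    """Percent change of measured relative to baseline."""
    return (measured - baseline) / baseline * 100.0


# (TLS disabled, TLS enabled) pairs taken from the 1500-warehouse results.
results = {
    "throughput (requests/sec)": (673.4, 649.3),
    "avg latency (ms)": (1069.8, 1281.9),
    "p99 latency (ms)": (18053.0, 24133.3),
}

for metric, (tls_off, tls_on) in results.items():
    print(f"{metric}: {percent_change(tls_off, tls_on):+.1f}%")
```

This gives roughly -3.6% for throughput, +19.8% for average latency, and +33.7% for p99 latency, which matches the rounded figures reported above.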
Overall, the performance of the two clusters is close, with the TLS-disabled cluster a few percentage points ahead. The main differences show up in the per-workload latencies: at worst, for the stock level workload, average latency is 26% worse and p99 latency is 48% worse with TLS enabled. However, the difference in efficiency between the two tests was under 5%.
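Since "efficiency" is not defined in this post, here is a small sketch that assumes the usual TPC-C convention: a warehouse can contribute at most 12.86 new-order transactions per minute (tpmC), and efficiency is the measured tpmC as a percentage of that ceiling. The function names are illustrative, and no measured tpmC values from this benchmark are used.

```python
# Assumption: the standard TPC-C ceiling of 12.86 tpmC per warehouse.
MAX_TPMC_PER_WAREHOUSE = 12.86


def max_tpmc(warehouses):
    """Theoretical maximum tpmC for a given warehouse count."""
    return MAX_TPMC_PER_WAREHOUSE * warehouses


def efficiency_pct(measured_tpmc, warehouses):
    """Measured tpmC as a percentage of the theoretical maximum."""
    return measured_tpmc / max_tpmc(warehouses) * 100.0


# Ceiling for the 1500-warehouse runs described in this post.
print(f"{max_tpmc(1500):.0f} tpmC")
```

Under that assumption, a difference of under 5% in efficiency means the two runs landed within a few hundred tpmC of each other relative to a ceiling of about 19,290 tpmC.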
For visual reference the below graphs show the performance for each run.
With TLS enabled, there is no major difference in performance compared to a cluster without TLS enabled. At both 100 and 1000 warehouses, the two clusters were within 1-2% of each other on CPU and less than 1% apart on overall efficiency. At 1500 warehouses we did see a drop in CPU efficiency and throughput on the TLS-enabled cluster. This is because TLS adds latency to each operation, as expected, so the CPU did less work overall. Because of the increased latency on each operation, we saw an overall increase of ~20% in latencies. However, even with these higher latencies, we don't see a large drop in throughput or efficiency of the workloads. If you have any questions, feel free to reach out to us on our community Slack channel at yugabyte.com/slack.