Securing YugabyteDB: Evaluating and Selecting the Right Security Tool

Securing the Infrastructure Behind Our Distributed DBaaS

Bharat Kumar Mukheja

As we embark on our mission at Yugabyte to create the most secure DBaaS available, we’re documenting our comprehensive journey in this blog series. Our first installment focused on establishing our requirements and making crucial storage and cost estimations for the infrastructure of our fully managed DBaaS (YugabyteDB Managed). Now, in this second blog, we turn our attention to the evaluation and deployment process.

Read Part One of Our SIEM / SOAR Quest >> 

SIEM Evaluation Overview

Working from our requirements, we evaluated several popular security tools (see table below) and selected the self-hosted version of the open-source SIEM/SOAR tool Wazuh.

Tool | Authenticated CVE Scanner | Malware Detection | FIM for Config. | IDS | OS-User Activities Monitoring
Crowdstrike Falcon | Yes | Yes | Yes | Maybe | Maybe
Datadog Agent | No | No | Yes | No | No
Wazuh (Host IDS) | Yes | Yes | Yes | Yes (Host-based IDS) | Yes
Tenable Nessus | Yes | No | No | No | No
Security Onion – Network IDS | Yes | Yes | Yes | Yes | Yes
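The table above can also be read as a simple filter. The sketch below transcribes the matrix and lists the tools that answer a definite "Yes" in every column. The capability keys and helper names are our own labels for illustration, and treating "Yes (Host-based IDS)" as a plain "Yes" is an assumption.

```python
# Evaluation matrix transcribed from the table above.
# Column order: CVE scanner, malware detection, FIM, IDS, OS-user activity monitoring.
# The capability keys are hypothetical short labels, not names used by any tool.
CAPABILITIES = ["cve_scanner", "malware_detection", "fim", "ids", "user_activity"]

MATRIX = {
    "Crowdstrike Falcon": ["Yes", "Yes", "Yes", "Maybe", "Maybe"],
    "Datadog Agent": ["No", "No", "Yes", "No", "No"],
    "Wazuh (Host IDS)": ["Yes", "Yes", "Yes", "Yes (Host-based IDS)", "Yes"],
    "Tenable Nessus": ["Yes", "No", "No", "No", "No"],
    "Security Onion – Network IDS": ["Yes", "Yes", "Yes", "Yes", "Yes"],
}


def covers_everything(answers):
    """True only if every capability is a definite 'Yes' (qualifiers allowed)."""
    return all(answer.startswith("Yes") for answer in answers)


def gaps(answers):
    """Return the capabilities that are not a definite 'Yes'."""
    return [cap for cap, answer in zip(CAPABILITIES, answers)
            if not answer.startswith("Yes")]


full_coverage = [tool for tool, answers in MATRIX.items()
                 if covers_everything(answers)]

print(full_coverage)
for tool, answers in MATRIX.items():
    print(tool, "gaps:", gaps(answers))
```

Read this way, the matrix narrows the field to Wazuh and Security Onion. The table is only one input to the decision.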

Utilizing Wazuh as the SIEM/SOAR Solution


The high-level architecture of a Wazuh deployment is shown in the image above. Here is a brief rundown to help explain that image and why we selected Wazuh during the evaluation process.

  • Wazuh supports a server-client architecture and a data-pull architecture at the same time. It calls the server-client side Wazuh Agents and the data-pull side agentless integration.
  • Wazuh Agents monitor any endpoints or hosts that support the installation of the agent binary. All prominent operating systems are supported.
  • Agentless integration pulls logs from cloud service providers, including Google Cloud, AWS, and Azure.
  • Combining agentless and agent-based monitoring makes Wazuh an all-around SIEM solution that covers our whole infrastructure.


Wazuh offers built-in active response scripts triggered by a variety of events, and it also allows for the creation of custom scripts for added functionality. After careful consideration, we’ve decided to proceed with the built-in scripts, as we find them suitable for most scenarios.
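For readers curious what a custom script would involve, here is a minimal sketch of the input-handling side. It assumes the script receives a JSON message carrying a "command" field ("add" or "delete") and the triggering alert under parameters.alert, which is how recent Wazuh releases describe the format. Check the documentation for your version before relying on it. The function names and the sample message are hypothetical, and this is not one of the built-in scripts we use.

```python
import json


def parse_active_response(raw_message):
    """Extract the command and triggering alert from an active response message.

    ASSUMPTION: the message is a JSON object with a "command" field
    ("add" or "delete") and the alert nested under parameters.alert.
    """
    message = json.loads(raw_message)
    command = message.get("command")
    if command not in ("add", "delete"):
        raise ValueError(f"unsupported command: {command!r}")
    alert = message.get("parameters", {}).get("alert", {})
    return command, alert


def describe_action(raw_message):
    """Summarize what a response script would do, without doing it."""
    command, alert = parse_active_response(raw_message)
    rule_id = alert.get("rule", {}).get("id", "unknown")
    source_ip = alert.get("data", {}).get("srcip", "unknown")
    verb = "apply" if command == "add" else "revert"
    return f"{verb} response for rule {rule_id} (source {source_ip})"


# Illustrative message only; the rule ID and IP address are made up.
sample = json.dumps({
    "version": 1,
    "command": "add",
    "parameters": {
        "alert": {"rule": {"id": "5712"}, "data": {"srcip": "203.0.113.7"}},
    },
})

print(describe_action(sample))
```

A real script would read the message from standard input and take an action, such as blocking the source address, instead of returning a description.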

Wazuh deployment architecture

Wazuh Deployment on YugabyteDB

Within the centralized server, we architected the infrastructure as follows:

  • Kubernetes-based deployment on the Google Kubernetes Engine (GKE)
  • Separate node pools for Wazuh Manager and Wazuh Indexers
  • Persistent storage for log retention
  • Terragrunt used for Infrastructure as Code (IaC)

We leveraged Wazuh’s Docker images for its manager and workers, creating the deployment scripts needed to deploy them with native Google Cloud (GCP) technologies wherever possible. This approach involved using KMS to store secrets, ConfigMaps to manage configuration values, load balancers to expose Wazuh endpoints, and Google-managed Prometheus to monitor GKE cluster health.

Data Sources

In addressing the server, collector, and dashboard aspects within the centralized server, we made significant progress, but that was only part of the journey. To extend monitoring across our production infrastructure, we undertook two essential steps:

  1. Integrated available auditing services from cloud service providers into Wazuh, using its agentless integration capabilities.
  2. Embedded Wazuh agents directly into our production database nodes, using Wazuh’s agent-based monitoring for more comprehensive oversight.

Agentless Integration

We captured AWS CloudTrail, GCP logging, and Azure Defender logs through Wazuh’s agentless integration. This connection allows Wazuh to regularly pull the latest audit logs from these cloud providers, using their pull mechanism. These audit logs typically include IAM activity, instance activity, network logs, and Kubernetes activity, providing valuable insights into any suspicious activity.
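The pull pattern itself is simple: remember how far you have read, then ask only for newer records. The sketch below shows that idea with an in-memory stand-in for a provider's audit log. It is a generic illustration, not how Wazuh's cloud modules are implemented. The `fetch_since` callable, the record shape, and the event names are all hypothetical.

```python
from datetime import datetime, timezone


def pull_new_records(fetch_since, checkpoint):
    """Fetch records newer than `checkpoint`; return (records, new_checkpoint).

    `fetch_since` is a hypothetical callable standing in for a provider's
    audit-log API. It takes a datetime and returns dicts with a "timestamp".
    """
    records = sorted(fetch_since(checkpoint), key=lambda r: r["timestamp"])
    # Advance the checkpoint only when something new arrived.
    new_checkpoint = records[-1]["timestamp"] if records else checkpoint
    return records, new_checkpoint


def ts(minute):
    """Helper for building sample timestamps."""
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


# Illustrative audit records; the event names are made up.
FAKE_LOG = [
    {"timestamp": ts(0), "event": "iam.role.update"},
    {"timestamp": ts(5), "event": "instance.start"},
    {"timestamp": ts(10), "event": "firewall.rule.create"},
]


def fake_fetch_since(checkpoint):
    """In-memory stand-in for a cloud provider's log query."""
    return [r for r in FAKE_LOG if r["timestamp"] > checkpoint]


first_batch, checkpoint = pull_new_records(fake_fetch_since, ts(0))
second_batch, checkpoint = pull_new_records(fake_fetch_since, checkpoint)

print(len(first_batch), len(second_batch))
```

The second pull returns nothing because the checkpoint already points at the newest record, which is what keeps repeated polling from ingesting duplicates.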

Agent-Based Monitoring

We integrated Wazuh agents into the AMIs for our fully managed YugabyteDB offering, ensuring a basic level of security for every database virtual machine (VM). Wazuh agents are lightweight, daemon-based services that don’t interfere with the OS on failure. After extensive testing, we found them safe to install on our machines.

Custom Data Sources

Since Wazuh uses OpenSearch for log ingestion at its backend, we can ingest logs from custom data sources into our SIEM. As a result, we’re developing custom data sources for our applications, allowing them to feed their audit logs into our SIEM tool. Currently, we plan to integrate the following sources (the list is expanding):

  • Audit logs and SSH session recordings from Zero Trust access providers.
  • User activity logs from identity management systems.
  • Activity logs from remote access tools.
  • VPN audit logs.
  • CDN (Content Delivery Network) audit logs.
  • Activity logs from password management systems.
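Feeding sources like these into one SIEM is easier when every event arrives in a common shape. The sketch below normalizes an application audit event into a single JSON line. The field names and the sample event are hypothetical and do not describe our actual schema or a Wazuh requirement.

```python
import json
from datetime import datetime, timezone


def to_siem_document(source, actor, action, occurred_at, details=None):
    """Build one JSON line for a custom audit event.

    The field names here are hypothetical; adapt them to your own schema.
    """
    if occurred_at.tzinfo is None:
        raise ValueError("occurred_at must be timezone-aware")
    document = {
        # Store timestamps in UTC so events from different sources line up.
        "timestamp": occurred_at.astimezone(timezone.utc).isoformat(),
        "source": source,
        "actor": actor,
        "action": action,
        "details": details or {},
    }
    return json.dumps(document, sort_keys=True)


# Illustrative event only.
line = to_siem_document(
    source="vpn",
    actor="alice@example.com",
    action="session.start",
    occurred_at=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    details={"client_ip": "198.51.100.4"},
)

print(line)
```

Rejecting naive timestamps up front avoids silently mixing local times from different systems in the same index.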


In addition to the integrations already mentioned, we’ve also connected the logs from our internal systems’ environments to Wazuh. This will ensure that our security coverage extends across our entire, ever-expanding infrastructure. At present, our integrations include:

  • Slack and PagerDuty for alerting and notifications.
  • GKE-managed Prometheus for observability.
  • Identity Provider for SAML.

Given our testing pipeline, we experience a massive churn of agents. We currently manage around 70K Wazuh agents. Despite this large number, the system operates flawlessly, generating substantial amounts of vital data for us.

And Now to the Results…

Wazuh Performance

As mentioned, we have successfully deployed over 70,000 Wazuh agents without incurring any downtime. Their operation has been flawless, generating extensive data to help secure our infrastructure. Implementing Wazuh as an SIEM/SOAR solution for YugabyteDB proved highly effective, which will help Yugabyte tremendously as we undergo our compliance audit.

Wazuh Agent Impact on YugabyteDB Performance

We have also assessed the performance impact of Wazuh agents on YugabyteDB nodes. The tests show that the agents have an insignificant CPU impact. With real-time monitoring enabled, the idle impact is near zero. The worst case occurs when the agent runs a full scan, and even then the observed CPU impact is no more than 0.2% for a 30-second window every hour. We also saw no impact on SQL/CQL query latency or ops/second once measurement uncertainty is accounted for.
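To put the worst case in perspective, the scan figure can be averaged over the hour. The short calculation below uses the 0.2% and 30-second numbers from the paragraph above and assumes the idle impact is exactly zero, which slightly understates the "near zero" we measured.

```python
SCAN_CPU_PERCENT = 0.2      # worst-case CPU impact during a full scan
SCAN_WINDOW_SECONDS = 30    # duration of the scan window
SECONDS_PER_HOUR = 3600     # the scan runs once per hour


def hourly_average_cpu_percent(peak_percent, window_seconds,
                               period_seconds=SECONDS_PER_HOUR):
    """Average CPU impact over one period, assuming zero impact outside the window."""
    duty_cycle = window_seconds / period_seconds
    return peak_percent * duty_cycle


average = hourly_average_cpu_percent(SCAN_CPU_PERCENT, SCAN_WINDOW_SECONDS)
print(f"{average:.4f}% average CPU over the hour")
```

The scan is active for 30 of every 3,600 seconds, so the averaged overhead comes to roughly 0.0017% of CPU.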
