Using Envoy Proxy’s PostgreSQL & TCP Filters to Collect Yugabyte SQL Statistics

Sid Choudhury

SVP, Product

Layer 7 proxies like NGINX and HAProxy have been popular since the mid-2000s. The term “proxy” refers to their role as an intermediary for the traffic between an application client and an application server. The “layer 7” classification comes from the fact that these proxies take routing decisions based on URLs, IPs, TCP/UDP ports, cookies, or any information present in messages sent over a layer 7 (aka application layer) networking protocol like HTTP and gRPC. Modern applications use these proxies for a variety of needs including load balancing, security, and web acceleration.

What is Envoy Proxy?

Envoy is a layer 7 proxy and communication bus designed for large modern service oriented architectures. Its home page has the following definition:

Originally built at Lyft, Envoy is a high performance C++ distributed proxy designed for single services and applications, as well as a communication bus and “universal data plane” designed for large microservice “service mesh” architectures. Built on the learnings of solutions such as NGINX, HAProxy, hardware load balancers, and cloud load balancers, Envoy runs alongside every application and abstracts the network by providing common features in a platform-agnostic manner. When all service traffic in an infrastructure flows via an Envoy mesh, it becomes easy to visualize problem areas via consistent observability, tune overall performance, and add substrate features in a single place.

As the recently published Dropbox’s migration from NGINX to Envoy highlights, Envoy is rapidly becoming the default proxy for cloud native applications that need higher performance, observability, extensibility, security, building and testing, and last but not least, deep features (such as HTTP/2, gRPC, and egress proxying). It was the third CNCF project to reach the graduated status, following Kubernetes and Prometheus, and has gained widespread adoption in a relatively short period of time.

Why use Envoy’s PostgreSQL filter?

Envoy supports configuration of multiple traffic listeners where each listener is composed of one or more filter chains. An individual filter chain is selected to process the incoming data based on the filter’s match criteria (which includes connection parameters such as destination port/IP, transport protocol name, source port/IP, and more). When a new connection is received on a listener, the matching filter chain is selected and instantiated. The filters then begin processing subsequent events. This generic listener architecture is used to perform the vast majority of different proxy tasks that Envoy is used for including rate limiting, TLS client authentication, HTTP connection management, raw TCP proxy, and more. One such task relevant to database deployments is the ability to instrument the wire protocol of popular databases such as MySQL, MongoDB, Kafka, and Amazon DynamoDB. PostgreSQL was missing from this list but the latest v1.15 release from July 2020 solved that problem by adding a PostgreSQL proxy filter. This filter is based on PostgreSQL frontend/backend protocol version 3.0, which was introduced in PostgreSQL 7.4.

The main goal of the PostgreSQL filter is to capture runtime statistics while remaining completely transparent to the database server. There is no additional monitoring software to deploy or manage in order to collect these vital statistics! As listed in the official docs, the filter currently offers the following features:

  • Decode non SSL traffic, ignore SSL traffic
  • Decode session information
  • Capture transaction information, including commits and rollbacks
  • Expose counters for different types of statements (INSERTs, SELECTs, DELETEs, UPDATEs, etc.)
  • Count frontend, backend, and unknown messages
  • Identify errors and backend responses

YugabyteDB is fully compatible with the PostgreSQL wire protocol and SQL syntax given that its SQL query layer is based on a fork of PostgreSQL 11.2’s query layer. As a result, YugabyteDB is able to leverage the PostgreSQL filter from Envoy without any modifications whatsoever. The rest of this post outlines the instructions to run the most basic YugabyteDB with Envoy setup (including the PostgreSQL & TCP filters) using Docker Compose. Official Envoy sandboxes use the same approach to test out different features and highlight sample configurations.

YugabyteDB with Envoy in action

Install Docker

Ensure that you have docker and docker-compose installed on your local machine. Docker Desktop can be the simplest way to achieve this goal.

Create the “YugabyteDB with Envoy” stack using docker-compose

Create a working directory

mkdir yugabyte-envoy

Create the envoy.yaml

Copy the following contents into a file named envoy.yaml.

static_resources:
  listeners:
  - name: yb_listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 1999
    filter_chains:
    - filters:
      - name: envoy.filters.network.postgres_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.postgres_proxy.v3alpha.PostgresProxy
          stat_prefix: ysql
      - name: envoy.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tcp_ysql
          cluster: yb_cluster

  clusters:
  - name: yb_cluster
    connect_timeout: 1s
    type: strict_dns
    load_assignment:
      cluster_name: yb_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: ysql
                port_value: 5433

admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8001

As we can see above, we have configured an Envoy listener on port 1999 that has a filter chain with two filters, namely PostgreSQL and TCP.

Create the envoy dockerfile

Copy the following contents into a file named Dockerfile-proxy. When built and instantiated, we will have an envoyproxy container that will use the envoy.yaml configuration we created in the previous step.

FROM envoyproxy/envoy-dev:latest

COPY ./envoy.yaml /etc/envoy.yaml
RUN chmod go+r /etc/envoy.yaml
CMD /usr/local/bin/envoy -c /etc/envoy.yaml -l debug

Create the yugabytedb dockerfile

Copy the following contents into a file named Dockerfile-yugabyte. When built and instantiated, we will have a single YugabyteDB container with the PostgreSQL-compatible YSQL API available on port 5433.

FROM yugabytedb/yugabyte:latest

CMD ["/home/yugabyte/bin/yugabyted","start","--daemon=false"]

Create the docker-compose.yaml

Copy the following contents into a file named docker-compose.yaml.

version: "3.7"
services:

  proxy:
    build:
      context: .
      dockerfile: Dockerfile-proxy
    networks:
      envoymesh:
        aliases:
          - envoy
    expose:
      - "1999"
      - "8001"
    ports:
      - "1999:1999"
      - "8001:8001"

  yugabyte:
    build:
      context: .
      dockerfile: Dockerfile-yugabyte
    networks:
      envoymesh:
        aliases:
          - ysql
    expose:
      - "5433"
    ports:
      - "5433:5433"

networks:
  envoymesh:
    name: envoymesh

Start the docker-compose stack

docker-compose pull
docker-compose up --build -d
docker-compose ps

Output from the ps command is shown below.

Name                      Command                     State                Ports                                              
---------------------------------------------------------------------------------
yugabyte_proxy_1      /docker-entrypoint.sh /bin ...   Up      ... 0.0.0.0:1999->1999/tcp, 0.0.0.0:8001->8001/tcp                                                                    
yugabyte_yugabyte_1   /home/yugabyte/bin/yugabyt ...   Up      ... 0.0.0.0:5433->5433/tcp, ...

As we can see, two containers have been spun up on a common envoymesh network.

1. yugabyte_yugabyte_1 is the YugabyteDB container that is ready to interact with PostgreSQL clients on port 5433.

2. yugabyte_proxy_1 is the Envoy proxy container that is running the PostgreSQL proxy on the 1999 port. Requests to this port get automatically redirected to the port 5433 on the YugabyteDB container.

Connect using ysqlsh via the envoy listener

We are now ready to connect to the YugabyteDB cluster using ysqlsh. However, instead of directly connecting to the 5433 port of the YugabyteDB container, we will connect to the Envoy proxy at the 1999 port.

docker run --rm -it --network envoymesh yugabytedb/yugabyte /home/yugabyte/bin/ysqlsh "sslmode=disable" -h envoy -p 1999

As highlighted in the Envoy docs, the current PostgreSQL filter decodes only non-SSL (aka unencrypted) traffic and ignores any SSL/encrypted traffic. The sslmode=disable option shown above is mandatory for Envoy to treat the PostgreSQL traffic as unencrypted even though the cluster has been set up without any encryption. Since this behavior is reproducible with both PostgreSQL 12 and 11.2 (Yugabyte SQL is based on a fork of this version), this is most likely a bug in the filter implementation.

Run basic YSQL commands

Let’s create a table.

CREATE TABLE links (
	id SERIAL PRIMARY KEY,
	url VARCHAR(255) NOT NULL,
	name VARCHAR(255) NOT NULL,
	description VARCHAR (255),
      last_update DATE
);

Now let us insert four rows into the table we created.

INSERT INTO links (url, name)
VALUES('https://www.postgresqltutorial.com','PostgreSQL Tutorial');

INSERT INTO links (url, name)
VALUES('https://www.oreilly.com','O''Reilly Media');

INSERT INTO links (url, name)
VALUES('https://docs.yugabyte.com','YugabyteDB Docs');

INSERT INTO links (url, name)
VALUES('https://blog.yugabyte.com','YugabyteDB Blog') 
RETURNING id;

We can now run a SELECT statement to get all the rows we inserted.

SELECT * FROM links;

We can also run a SELECT statement to get the count of rows we inserted.

SELECT count(*) FROM links;

Review YSQL statistics collected by Envoy’s filters

Each of the two filters configured provide us with statistics relevant to the data they observe. All these statistics are available on the Envoy stats page https://localhost:8001/stats.

Using the PostgreSQL filter

Since we gave the stats prefix as ysql, we see all the statistics with the overall prefix as postgres.ysql. The ones that we can easily verify based on the queries we executed in the ysqlsh session we created are highlighted in bold.

postgres.ysql.errors: 0
postgres.ysql.errors_error: 0
postgres.ysql.errors_fatal: 0
postgres.ysql.errors_panic: 0
postgres.ysql.errors_unknown: 0
postgres.ysql.messages: 45
postgres.ysql.messages_backend: 37
postgres.ysql.messages_frontend: 8
postgres.ysql.messages_unknown: 0
postgres.ysql.notices: 0
postgres.ysql.notices_debug: 0
postgres.ysql.notices_info: 0
postgres.ysql.notices_log: 0
postgres.ysql.notices_notice: 0
postgres.ysql.notices_unknown: 0
postgres.ysql.notices_warning: 0
postgres.ysql.sessions: 1
postgres.ysql.sessions_encrypted: 0
postgres.ysql.sessions_unencrypted: 1
postgres.ysql.statements: 7
postgres.ysql.statements_delete: 0
postgres.ysql.statements_insert: 4
postgres.ysql.statements_other: 1
postgres.ysql.statements_parse_error: 2
postgres.ysql.statements_parsed: 5
postgres.ysql.statements_select: 2
postgres.ysql.statements_update: 0
postgres.ysql.transactions: 7
postgres.ysql.transactions_commit: 7
postgres.ysql.transactions_rollback: 0

Using the TCP filter

The TCP statistics relevant to YSQL are available with the tcp.tcp_ysql prefix. As we can see, the statistics are at the network level including bytes transmitted and bytes received.

tcp.tcp_ysql.downstream_cx_no_route: 0
tcp.tcp_ysql.downstream_cx_rx_bytes_buffered: 33
tcp.tcp_ysql.downstream_cx_rx_bytes_total: 693
tcp.tcp_ysql.downstream_cx_total: 1
tcp.tcp_ysql.downstream_cx_tx_bytes_buffered: 0
tcp.tcp_ysql.downstream_cx_tx_bytes_total: 991
tcp.tcp_ysql.downstream_flow_control_paused_reading_total: 0
tcp.tcp_ysql.downstream_flow_control_resumed_reading_total: 0
tcp.tcp_ysql.idle_timeout: 0
tcp.tcp_ysql.upstream_flush_active: 0
tcp.tcp_ysql.upstream_flush_total: 0

Remove all the containers in the stack

We can remove the containers we have previously spun up using the command below.

docker rm -f $(docker ps -aq)

Summary

The recently released PostgreSQL filter from Envoy Proxy makes it extremely easy for developers and operations engineers to collect SQL statistics from YSQL, YugabyteDB’s PostgreSQL-compatible fully-relational distributed SQL API. The filter runs inside the Envoy Proxy sidecar container and works by simply sniffing the network traffic in a manner that is completely transparent to the database server. As a result, developers and operations engineers can leverage the integration without deploying and managing any additional software.

We welcome all users to give the integration a try today and provide us feedback via GitHub and Slack.

Sid Choudhury

SVP, Product

Related Posts

Explore Distributed SQL and YugabyteDB in Depth

Discover the future of data management.
Learn at Yugabyte University
Get Started
Browse Yugabyte Docs
Explore docs
PostgreSQL For Cloud Native World
Read for Free