From Mainframes to Microservices: Leveraging Change Data Capture in Modern Databases

Christiane Soto

What is Change Data Capture?

Change data capture (CDC) efficiently identifies and tracks data changes in a database, so that actions can be taken based on these changes. YugabyteDB’s CDC reads changes from the database’s write-ahead log (WAL) and streams them to downstream consumers such as external processes, applications, or other databases.

How Change Data Capture Works With YugabyteDB

YugabyteDB offers three DBaaS deployment models: self-managed, co-managed, and fully managed, in addition to its open-source version.

Our self-managed option (YugabyteDB Anywhere) offers a customer-managed control plane for creating, orchestrating, monitoring, and managing the database. This architecture includes a built-in change data capture (CDC) service. YugabyteDB’s CDC integrates with Debezium, an open-source change data capture platform built on Kafka Connect, whose connector ties YugabyteDB into Kafka. This lets you build a data pipeline in which changed data from YugabyteDB moves through Debezium and into Kafka.

Figure: YugabyteDB change data capture architecture

From there, it can be sent through sink connectors to systems like Elasticsearch, Snowflake, or Amazon S3, or consumed by anything in the Kafka ecosystem, including ksqlDB and Kafka Streams.
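To make the pipeline a little more concrete, here is a minimal sketch of the JSON request body you might send to Kafka Connect to register a Debezium source connector for YugabyteDB. The helper name `build_connector_request`, the hosts, the credentials, the table names, and the stream ID placeholder are all assumptions for illustration. The property names follow the YugabyteDB Debezium connector documentation as best recalled here, so check them against the docs for your version before using them.

```python
import json


def build_connector_request(name, host, master_addresses, stream_id, tables):
    """Build a request body for registering a source connector with Kafka Connect.

    Property names are assumptions based on the YugabyteDB Debezium connector
    docs; all values are placeholders.
    """
    config = {
        "connector.class": "io.debezium.connector.yugabytedb.YugabyteDBConnector",
        "database.hostname": host,
        "database.port": "5433",
        "database.master.addresses": master_addresses,
        "database.user": "yugabyte",        # placeholder credentials
        "database.password": "yugabyte",
        "database.dbname": "yugabyte",
        "database.server.name": "dbserver1",  # used as the Kafka topic prefix
        "database.streamid": stream_id,       # ID of a CDC stream created beforehand
        "table.include.list": ",".join(tables),
        "snapshot.mode": "initial",           # assumed: snapshot first, then stream changes
    }
    return {"name": name, "config": config}


request_body = build_connector_request(
    name="yb-orders-connector",
    host="127.0.0.1",
    master_addresses="127.0.0.1:7100",
    stream_id="<your-stream-id>",
    tables=["public.orders", "public.customers"],
)
print(json.dumps(request_body, indent=2))
```

In practice, a body like this would be posted to the Kafka Connect REST endpoint for creating connectors. The sketch only builds and prints the payload, so it runs without a cluster.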

Change Data Capture in YugabyteDB

YugabyteDB’s CDC offers several key features.

  • It operates as a log reader-based capture, working with YugabyteDB’s write-ahead logging (WAL) format.
  • It pulls transaction changes aggregated in micro-batches by the Yugabyte database.
  • It is timeline consistent (both row- and shard-based).
  • It supports JSON and Avro formats.
  • It adheres to Kafka’s at-least-once delivery semantics and offers adjustable retention based on time or disk size, allowing users to customize retention settings on either the YugabyteDB or Kafka side.
  • It supports initial snapshots, so you can take a current snapshot of your data to use as the starting point for the feed into Kafka and proceed from that point with ongoing changes.
  • It facilitates cloud and on-premises change delivery. For example, it can sync data from on-premises environments to cloud-managed services like Snowflake or Redshift in near real-time, as long as there’s network connectivity.
  • It offers transactionally consistent CDC. New as of YugabyteDB 2.20, this feature provides an aggregated view of every transaction in the order it occurred, for any downstream system that requires it.
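Because delivery is at-least-once, a downstream consumer may see the same change more than once and should apply changes idempotently. The following is a minimal sketch of that idea, not YugabyteDB’s or Debezium’s actual event format. The event fields (`key`, `seq`, `op`, `after`) and the function `apply_changes` are hypothetical, and `seq` is assumed to increase monotonically for each row.

```python
def apply_changes(events, table=None, applied=None):
    """Apply change events to an in-memory table, skipping redelivered events.

    The event shape is hypothetical: 'key' identifies the row, 'seq' is an
    assumed per-row increasing sequence number, 'op' is 'c' (create),
    'u' (update), or 'd' (delete), and 'after' is the new row image.
    """
    table = {} if table is None else table
    applied = {} if applied is None else applied  # row key -> highest seq applied

    for event in events:
        key, seq = event["key"], event["seq"]
        if applied.get(key, -1) >= seq:
            continue  # duplicate or stale redelivery: already applied
        if event["op"] == "d":
            table.pop(key, None)
        else:
            table[key] = event["after"]
        applied[key] = seq
    return table, applied


events = [
    {"key": 1, "seq": 1, "op": "c", "after": {"balance": 100}},
    {"key": 1, "seq": 2, "op": "u", "after": {"balance": 80}},
    {"key": 1, "seq": 2, "op": "u", "after": {"balance": 80}},   # redelivered
    {"key": 2, "seq": 1, "op": "c", "after": {"balance": 5}},
    {"key": 2, "seq": 2, "op": "d", "after": None},
    {"key": 1, "seq": 1, "op": "c", "after": {"balance": 100}},  # stale redelivery
]

table, applied = apply_changes(events)
print(table)
```

Tracking the highest applied sequence per row is one simple way to make replays harmless. A real consumer would persist that state alongside the target data so the two stay consistent across restarts.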

How Companies Use Change Data Capture (CDC) for Streaming Data

Let’s take a look at two Yugabyte customers who are successfully using change data capture.

  1. A large brokerage institution (which must, unfortunately, remain anonymous) uses change data capture to stream data into YugabyteDB. Their goal was to reduce their reliance on a costly mainframe system and gradually establish YugabyteDB as their primary system of record. Initially, the Yugabyte database served as the system of reference for the microservices applications, staying in sync with the mainframe through CDC. By using this approach, the brokerage firm was able to accelerate its adoption of YugabyteDB, reduce its footprint on the mainframe, and cut costs. Notably, they achieved a tenfold increase in performance, reaching up to 200K business transactions per second with latency of 10 ms or less. This transition marked their shift from the mainframe to YugabyteDB for critical applications.
     Figure: Brokerage firm architecture using YugabyteDB to transition from the mainframe
  2. A large streaming media company (which also must remain anonymous) was using MySQL for the streaming workloads that supported their customer subscription and program catalog data applications. They needed to migrate from this MySQL-based architecture because it was not resilient or reliable enough, especially during high-traffic periods such as a high-profile sporting event or a popular show. They did not want those systems (and apps) to go down and leave their subscriber base unhappy and disillusioned. Recognizing the need for a more resilient database architecture, they chose YugabyteDB. We collaborated to develop a migration strategy from MySQL to YugabyteDB, incorporating a “fall forward” approach. The media company used a technology from Arcion to transfer data (using CDC) from YugabyteDB to a secondary MySQL database, ensuring it stayed in sync during the migration. This strategy significantly reduced the migration risk, because they could fall forward to the in-sync MySQL database if needed. The result? YugabyteDB was smoothly and quickly integrated into their overall infrastructure, and they have seen significant improvement in their service delivery.
