Distributed SQL Summit Recap: Cloud Native Spring for Relational Databases

January 15, 2021

Editor’s note: Below is the final recap from last year’s event. There’s still time to join us live for the upcoming Distributed SQL Summit, Jan 20-22 in India Standard Time.

At the Distributed SQL Summit 2020, DaShaun Carter, formerly a Tanzu Solution Engineer at VMware, presented the talk “Cloud Native Spring for Relational Databases”.

In the talk, DaShaun takes us down a possible path to distributed SQL, keeping the discussion at a 101 level.

DaShaun kicks the presentation off by inviting the audience to consider a use case: a Java app, RESTful services, and single-node PostgreSQL. For example, an MVP that he wrote 10 years ago went to production, and growth happened. DaShaun adds that the app only has value when data is written to the database, such as a landing page collecting a lead or the sale of something; the write to the database kicks things off. In this use case, you’ve vertically scaled as much as possible, and now you need even more scale. How best to approach this?

DaShaun invites the audience to consider the cost per transaction. How much does each write to the database cost? He describes how the math goes into three buckets: political, emotional, and financial. He adds that there are three stakeholder groups to consider: DBAs, app devs, and leadership. From a DBA perspective, scaling by going from single-node Postgres to distributed Postgres (via YugabyteDB) seems like a relatively straightforward move to make. From the app dev perspective, going from Spring to Spring (because the Spring Data support for Postgres will continue to work as-is with Yugabyte’s Postgres-compatible YSQL API) would help accomplish goals for scale without having to change a bunch of code. Then there’s the third group, leadership: directors, VPs, or even the C-suite, who care about the business value of making the decision.

DaShaun is a self-professed Spring fan: he has seen the value, has taken it to production, and it makes him happy, Spring Data included. Scaling reads is easy; scaling writes is the tough part. And how do you predict growth? It’s an expensive mistake to provision for 10 billion ops/day and only need 5 billion. It’s equally expensive to plan for 10 billion ops/day when you actually need 27 billion.

And maybe you’ve been in a scenario where you needed to move into a new location because of GDPR.

DaShaun makes a few recommendations:

Spring Data R2DBC is an option to solve some of the at-scale problems, keeping in mind the cost per transaction.
If you’re on a journey to solve for scale around your SQL data, considering the cost per transaction will help you back up your decisions.
Work across app dev, DBA, and leadership teams to make decisions together that drive ROI at scale.

He also points out that, if you haven’t already, start measuring your cost per transaction, whether it’s reads or writes. Your MVP likely won’t see issues around cost per transaction because it’s at a low scale; but when you hit 30 billion writes or more, the number becomes really important. Consider what number you’re trying to get to.
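As a starting point for that measurement, here is a minimal sketch in Java. It is not from the talk: the class name, the method name, and the input figures are all hypothetical, and it assumes the simplest possible definition, total spend for a period divided by the number of transactions completed in that same period.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper (not from the talk): a back-of-the-envelope
// cost-per-transaction calculation.
final class CostPerTransaction {

    private CostPerTransaction() {}

    /**
     * Divides the total cost for a period by the transactions served in it.
     *
     * @param totalCost    total spend for the period, in a single currency
     * @param transactions reads or writes completed in the same period; must be positive
     * @return cost per transaction, rounded half-up to 12 decimal places
     */
    static BigDecimal perTransaction(BigDecimal totalCost, long transactions) {
        if (transactions <= 0) {
            throw new IllegalArgumentException("transactions must be positive");
        }
        return totalCost.divide(BigDecimal.valueOf(transactions), 12, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Illustrative inputs only: 30,000 currency units spent on 30 billion writes.
        BigDecimal cost = perTransaction(new BigDecimal("30000"), 30_000_000_000L);
        System.out.println("Cost per write: " + cost.toPlainString());
    }
}
```

What goes into the total cost (infrastructure, licenses, staff time) is a decision for the team. Whatever you include, keep the definition the same for each option you compare so that the figures are comparable.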

DaShaun goes on to show some sample code, using Spring Data R2DBC, which is a module in incubation. If you know Spring, the code using Spring Data will look familiar. If you’re used to Postgres, the code pointing to YugabyteDB will look familiar.

DaShaun concludes by stating that if you’ve had problems scaling, and you want to quickly move to a more scalable solution, consider Spring Data and YugabyteDB distributed SQL. You don’t have to change a lot of code, as an app dev or DBA. Your cost per transaction may be agreeable. And the speed to market may be quick. Work together across app devs, DBAs, and leadership, to make a joint decision that is data-driven, including cost per transaction.