The Power of One: Building a Data Platform That Unlocks the Value of Data
Jay Duraisamy is the SVP of Technology for Data and Analytics at Fiserv. He recently chatted with us about his experiences driving data-driven innovation in financial services, and how customer expectations are fueling change. He also walked through how Fiserv is unlocking the value of their data and providing new services for both merchants and consumers by bringing together data and analytics on one platform.
Below is an excerpt of the key points covered during the conversation. Continue reading or watch the full interview, where Jay is joined by Matt Aslett, VP and Research Director with Ventana Research.
Q: Let’s start with a quick introduction. Can you share an overview of Fiserv and then discuss your background and your current role at the company?
Fiserv is a global fintech and banking solution provider, offering a wide range of banking, e-commerce, merchant acquisition, billing and payments, and point-of-sale services. We handle the movement of over 1.5 trillion US dollars on our platform. I joined the company about a year ago to be part of a newly established business unit—the data and analytics division. Our goal is to gather data from various business units, align that data, and unlock its value to provide newer and better services for our merchant customers and consumers.
Q: Can you talk about how customer expectations have changed over the past five years, and what are some of these wider trends that the financial services industry is responding to?
Customer expectations are changing. Customers are looking for (demanding, even) differentiated services. We have a lot of data coming in from our processing systems. The challenge we face is how to bring all this data together in one platform to create a complete 360-degree view of the consumer and merchant, so we can serve both with new services that provide value—some unique and some shared. One way we provide value, as opposed to traditional data systems, is through what we call our “lack of latency.” Because of this, we can access data we are bringing in across all our different business units—in real time—to use for risk and payment verification purposes. This is the main focus of our data and analytics division.
Q: Can you give some examples of new services or enhancements that have been rolled out to customers or consumers?
Let me give you a couple of examples. One way is to assist small businesses and merchants who need loans or verifications. Traditional bureaus often have to work with old and limited information about the business, something we call “thin files.” Our goal is to update and enhance the information in that environment so that we can help our customers run better risk assessments or verifications using our data. This is a great use case and really highlights the power and value of our “lack of latency.”
Another example lies on the banking side of our business and how we bring all this data together to create a 360-degree customer view to correctly identify (and prevent) fraud across any of our business units.
Q: You are building a data platform that will allow data and analytics to work together to help business units across Fiserv monetize their data. Can you give a high-level overview of the design considerations for this platform?
The fundamental principle behind our data platform is to bring all the data together to create a 360-degree view of the customer. This involves connecting data, and all the corresponding dimensions, from different business units and linking them together. Another extremely important consideration is that this platform (the entire ecosystem, actually) has to serve customers in both batch and online modes. We don’t want to build different platforms within different ecosystems. We want one system with one ETL and a linked data pipeline that pushes the latest data for online use while maintaining the historical data for batch processing.
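As an illustration of that “one linked pipeline” idea, here is a minimal, hypothetical Python sketch (the class and field names are ours, not Fiserv’s): a single write path feeds both a full history for batch processing and a latest-state view for online serving.

```python
from typing import Any, Dict, List, Tuple

class LinkedPipeline:
    """One write path serving two read patterns: every event is appended
    to the full history (batch) and also updates the latest view (online)."""

    def __init__(self) -> None:
        self.history: List[Tuple[str, Dict[str, Any]]] = []  # historical data for batch
        self.latest: Dict[str, Dict[str, Any]] = {}          # latest state for online use

    def push(self, key: str, event: Dict[str, Any]) -> None:
        self.history.append((key, event))
        self.latest[key] = event
```

Batch jobs scan `history`, while online lookups hit `latest` directly, so both modes are fed by the same ingestion path rather than by separate platforms.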
Q: With different use cases requiring batch or real-time processing, how do you approach the design of the platform? Do you build it based on specific use cases or after you aggregate all the requirements? How do you work with different business units and app development teams to get the data and meet their needs for both today and the next five years?
For a company our size that continues to grow, we have many different types of systems—from modern to mainframe. So, data is coming in in real time and in batch. It’s flowing in from retail tools and being keyed in.
Our philosophy is to bring all that data in only once and maintain it as the source of truth. How that has been traditionally handled is by looking at a use case, taking a subset of data from a subset of systems, and building up a data ingestion pipeline using the technologies and the architecture available at that time. Once that is completed, you start building up more use cases. The problem is that, over time, you end up with multiple architectures and multiple ways of reaching over into systems and pulling out data. We wanted to avoid that.
We want “one architecture.” This requires developing a generic pattern for data ingestion, including streaming, batch processing, ETLs, and cataloging. By doing this, we avoid having multiple architectures and data pipelines. Instead, we want to bring in the data as a source of truth from the source system and maintain it once, and then build multiple products on top of it.
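A rough sketch of what such a generic ingestion pattern could look like, assuming nothing about Fiserv’s actual implementation (all names here are illustrative): one normalization step, reused by both the streaming and batch paths, so every source system lands in the platform the same way.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List

@dataclass
class Record:
    source: str                 # originating system, e.g. a point-of-sale feed
    payload: Dict[str, Any]

def normalize(record: Record) -> Dict[str, Any]:
    """Shared normalization step: one transform applied to every source system."""
    return {"source": record.source,
            **{k.lower(): v for k, v in record.payload.items()}}

def ingest_stream(records: Iterable[Record], sink: List[Dict[str, Any]]) -> None:
    """Streaming path: apply the shared transform record-at-a-time."""
    for r in records:
        sink.append(normalize(r))

def ingest_batch(records: List[Record], sink: List[Dict[str, Any]]) -> None:
    """Batch path: the same transform, applied over a whole batch."""
    sink.extend(normalize(r) for r in records)
```

Because both paths call the same `normalize` step, adding a new source system means writing one transform once, rather than one per architecture.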
Q: Your data analytics team is both a business unit and works to serve other teams and meet their needs. How have the needs of, for example, your app teams changed? Are they looking for different things from data? How are they thinking about the world?
The app teams’ needs have changed, and this aligns with what I call the evolution from Data 1.0 to Data 3.0. Data 1.0 involved collecting transactional data in a relational database. Data 2.0 involved using a data warehouse and star schema for business analytics, reporting, and intelligence. Data 3.0 is about making real-time decisions using our data. Now, the balance comes in continuing to do Data 1.0 very well—it is our bread and butter, after all—while growing the Data 3.0 part of our business, so that we can serve our customers better through horizontal, cross-functional, value-driven solutions across all our data views. This requires an infrastructure that is highly scalable and distributed.
Q: Let’s talk about the components of your modern data stack. What are the pieces that go into building this modern data stack for data and analytics?
Fiserv has taken a hybrid approach, and when I say hybrid, I mean that we do have an on-prem architecture, but we choose tools that are available for the cloud as well. For example, we use ETL tools like Integrate.io or Talend and also Kafka for streaming data into our ecosystem. We store the data in a cloud-native file system like Azure Data Lake Gen 2, and we also build out batch systems as part of our pipeline. Our goal is to make the data available in real time through APIs for our decision engines. This is where distributed SQL comes into play.
Q: We see many companies are talking about this concept of a consolidated data and analytics platform using a stack of Snowflake, YugabyteDB, and Kafka. Snowflake for the data warehouse, YugabyteDB for the transactional layer, and Kafka as the connection between the systems. These systems are cloud-native, scalable, and highly resilient. Does this match your experience as well?
Absolutely. The MPP architecture of Snowflake fits well for high-throughput, large-volume data processing in a batch manner. However, when you talk about running microservices with millisecond response times, we need a distributed SQL architecture. Distributed SQL is important to us because our latency SLAs are very tight, so we have to be very strong in our ability to scale and still respond in under 200 milliseconds. Both systems are critical for us and for our customers.
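To make the 200-millisecond figure concrete, here is a small, hypothetical Python helper (the budget value comes from the conversation; the function itself is our illustration, not Fiserv code) that times a call against such a latency budget:

```python
import time
from typing import Any, Callable, Tuple

LATENCY_BUDGET_MS = 200  # illustrative SLA figure from the conversation

def within_sla(func: Callable[..., Any], *args: Any,
               budget_ms: float = LATENCY_BUDGET_MS) -> Tuple[Any, bool]:
    """Run a call and report whether it finished inside the latency budget."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms <= budget_ms
```

In practice a wrapper like this would sit around the database call in a decision-engine microservice, feeding SLA dashboards or alerting when the budget is breached.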
Q: What are your thoughts on the changing landscape of relational databases with the emergence of distributed SQL and how can distributed SQL help serve your needs?
I think the world has moved from monoliths to microservices, and having microservices on top of a monolithic platform only allows you to scale to a certain extent. That’s why many data-centric products and solutions move toward a NoSQL architecture. And as we know, NoSQL comes with certain trade-offs. I always look for a solution that can merge these two. YugabyteDB presents a solution with both PostgreSQL and Cassandra grammars while being fully ACID compliant. It seems to be the next evolution of the NoSQL architecture. We are still in the early stages—as you know—but it is definitely promising.
Q: You mentioned in your introduction that you are part of a business unit at Fiserv. So with that in mind, how would you measure the business value of your data and analytics stack, particularly for companies looking to modernize?
In my opinion, the best way to measure the business value is through a platform- and product-centric approach. First, you need to identify the customer problems that you are trying to solve that cannot be solved with traditional architecture.
For example, if you are looking to build a comprehensive, company-wide fraud solution, you would need to bring in data from all your fraud systems and subsystems, collate it, and keep it in a time-series database—all at extremely low latency.
On the other hand, if the customer wants to bring in their data and use it for analytical reporting, a different architecture like Snowflake may be more appropriate. It all depends on the context and what the customer needs from a platform/product-centric perspective.