How change data capture technology aids in real-time analytics
(Editor's note: This article, the third in a three-part series, was excerpted from "Streaming Change Data Capture: A Foundation for Modern Analytics." This book was published in May by O’Reilly Media and is available now for download.)
In the first article of this series, we examined how change data capture software enables continuous incremental replication by identifying and copying data updates in real time. When implemented effectively, CDC enables enterprises to be agile and efficient while meeting requirements for modern analytics.
The second article examined how CDC technology is driving new data architectures for firms in the healthcare, manufacturing and financial services industries.
These next two case studies, both in the financial services sector, demonstrate additional benefits of processing incremental data and metadata updates in real time rather than through resource-draining batch (a.k.a. full) data loads.
Case Study 4: Supporting Microservices on the AWS Cloud Architecture
The CIO of a very large investment management firm, which we’ll call “Nest Egg,” has initiated an ambitious rollout of cloud-based microservices as a way to modernize applications that rely on data in core mainframe transactional systems. Its on-premises DB2 z/OS production system continuously processes and reconciles transactional updates for more than a trillion dollars of assets under management. Nest Egg has deployed CDC to capture these updates from the DB2 transaction log and send them via encrypted multipathing to the Amazon Web Services (AWS) cloud. There, certain transaction records are copied straight to an RDS database, using the same schemas as the source, for analytics by a single line of business.
In addition, CDC copies transaction updates to an Amazon Kinesis message stream to which Nest Egg applies custom transformation logic. As shown in the figure below, the transformed data then arrives in the AWS NoSQL platform DynamoDB, which feeds a microservices hub on RDS. Multiple regional centers, including offices based in London and Sydney, receive continuous updates from this hub via DynamoDB Streams.
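The book does not describe Nest Egg's transformation logic, so the following is only a minimal sketch of what a per-event transformation between a change stream and a key-value store might look like. The event layout (op, before, after, commit_ts) and the column names ACCOUNT_ID, POSITION_VALUE and CURRENCY are hypothetical, and the sketch deliberately stops short of calling any AWS API: it only builds the action that a writer would then apply to the target.

```python
from decimal import Decimal

def transform_change(event):
    """Turn one CDC change event into an action for a key-value target.

    The event shape and column names are assumptions made for this sketch,
    not the format of any particular CDC product.
    """
    op = event["op"]
    if op not in ("insert", "update", "delete"):
        raise ValueError(f"unknown operation: {op}")

    # A delete carries only the "before" image; inserts and updates carry "after".
    row = event["before"] if op == "delete" else event["after"]
    key = {"account_id": row["ACCOUNT_ID"]}

    if op == "delete":
        return {"action": "delete", "key": key}

    item = dict(key)
    # Parse money as Decimal rather than float to avoid rounding drift.
    item["position_value"] = Decimal(str(row["POSITION_VALUE"]))
    # Normalize padded, mixed-case codes from fixed-width source columns.
    item["currency"] = row["CURRENCY"].strip().upper()
    # Keep the source commit time so consumers can order or deduplicate updates.
    item["source_commit_ts"] = event["commit_ts"]
    return {"action": "put", "item": item}
```

Because inserts and updates both become a full "put" keyed on the account, replaying the same event twice leaves the target unchanged, which is a convenient property when a stream delivers a record more than once.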
As a result, Nest Egg’s microservices architecture delivers a wide range of modular, independently provisioned services. Clients across the globe have real-time control of their accounts and trading positions. And it all starts with efficient, scalable and real-time data synchronization via CDC.
Case Study 5: Real-Time Operational Data Store/Data Warehouse
This case study, of a military federal credit union that we’ll call “USave,” involves a relatively straightforward architecture. USave needed to monitor deposits, loans and other transactions on a real-time basis to measure the state of the business and identify potentially fraudulent activity. To do this, it had to improve the efficiency of its data replication process. This required continuous copies of transactional data from the company’s production Oracle database to an operational data store (ODS) based on SQL Server.
Although the target is an ODS rather than a full-fledged data warehouse, this case study illustrates the advantages of CDC for high-scale structured analysis and reporting. USave deployed CDC software on an intermediate server between Oracle and SQL Server. The company automatically created tables on the SQL Server target, capturing the essential elements of the source schema while still using SQL Server-appropriate data types and table names. USave was able to rapidly execute an initial load of 30 tables while simultaneously applying incremental source changes. One table of 2.3 million rows took one minute to load. Updates are now copied continuously to the ODS.
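To make the idea of automatic target table creation concrete, here is a small sketch of how source column definitions might be translated into SQL Server DDL. It is an illustration only, not how USave's CDC software works: the TYPE_MAP table covers just a handful of common Oracle types, and the function names, column metadata layout and example table are all hypothetical.

```python
# Deliberately small, assumed mapping from Oracle types to SQL Server types.
# A real tool would cover many more types and edge cases.
TYPE_MAP = {
    "VARCHAR2": "VARCHAR",
    "NVARCHAR2": "NVARCHAR",
    "NUMBER": "NUMERIC",
    "DATE": "DATETIME2",  # Oracle DATE carries a time of day as well
    "CLOB": "VARCHAR(MAX)",
}

def target_column(name, oracle_type, length=None, precision=None,
                  scale=None, nullable=True):
    """Render one SQL Server column definition from Oracle column metadata."""
    base = TYPE_MAP[oracle_type]  # KeyError signals an unmapped source type
    if oracle_type in ("VARCHAR2", "NVARCHAR2") and length is not None:
        base = f"{base}({length})"
    elif oracle_type == "NUMBER" and precision is not None:
        base = f"{base}({precision}, {scale or 0})"
    null_sql = "NULL" if nullable else "NOT NULL"
    return f"[{name}] {base} {null_sql}"

def create_table_sql(table, columns):
    """Build a CREATE TABLE statement for the target from a list of columns."""
    cols = ",\n  ".join(target_column(**col) for col in columns)
    return f"CREATE TABLE [{table}] (\n  {cols}\n);"

if __name__ == "__main__":
    # Hypothetical source table used only for this example.
    loans = [
        {"name": "LOAN_ID", "oracle_type": "NUMBER", "precision": 12,
         "scale": 0, "nullable": False},
        {"name": "MEMBER_NAME", "oracle_type": "VARCHAR2", "length": 100},
        {"name": "OPENED_ON", "oracle_type": "DATE"},
    ]
    print(create_table_sql("LOANS", loans))
```

Raising an error on an unmapped type, rather than silently falling back to a generic string column, is a design choice that surfaces schema surprises before the initial load starts.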
As illustrated in this series of articles, CDC software is the driving force behind the rise of modern data architectures that improve the efficiency, scale and speed of data consumption, while reducing the impact on production applications. Companies are becoming data-driven as they unlock data’s potential to drive new revenue opportunities, enhance the customer experience and create greater operational efficiencies.