Zions Bank, the Salt Lake City financial institution, is using big data to beat back hackers and fight fraud. Using the Hadoop software platform and other big data technologies, the bank is able to quickly analyze transactions and spot suspicious activity almost as fast as the potential problems arise.
The bank, a subsidiary of Zions Bancorporation, operates nearly 500 offices and 580 ATMs in 10 western U.S. states. It offers commercial, installment and mortgage loans; trust services; foreign banking services; electronic and online banking services; automatic deposit and nationwide banking and transfer services; and checking and savings programs.
Like other big banks, Zions is a target. Financial services firms were among the most targeted organizations for security breaches over the past year, with 52 companies surveyed from that sector reporting a total of 642 security incidents, 277 of which involved confirmed data loss, according to Verizon’s 2015 Data Breach Investigations Report.
Zions first launched its effort to use data analytics in the battle against fraud in the early 2000s, well before big data became a popular term. It then took a significant step up about four years ago by deploying M5, a Hadoop-based storage system from MapR, to control all the data it was collecting on a regular basis.
Since it first deployed the system, the bank has continued to expand the number of new data sets it has loaded into its Hadoop systems and has increased the number of fraud and security models that it has developed to counter security threats.
“Prior to using Hadoop we had a problem with reporting against very large data sets,” says Michael Fowkes, Zions Bank senior vice president of data science and security analytics. “Reporting was too slow and it was cost prohibitive to expand the old system to meet our reporting/analytic needs.”
But big data addressed that challenge and the bank says it’s much more secure because of the technology.
Fowkes would not say specifically how much the company has saved through reduced fraud or security incidents, or even provide an estimate. “But I can say that the system pays for itself year-over-year,” he says.
The original business driver behind Zions’ big data effort was information security and compliance with regulations such as the Sarbanes-Oxley and Gramm-Leach-Bliley Acts, Fowkes says.
“We were looking for a solution to archive and report against large amounts of information security log files,” Fowkes says. “We quickly discovered that this data and [analytics] were beneficial in performing forensic work with security incidents and in detecting fraud.”
The bank’s security analytics team maintains its own big data stores and uses the R programming language to build models to detect fraudulent activity such as phishing attacks and suspicious financial transactions.
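To make the idea of a transaction model concrete, here is a minimal sketch of one very simple approach: scoring a new wire amount against an account's own history. It is written in Python purely for illustration (the bank uses R), and the function names, the z-score technique and the threshold are all assumptions rather than anything Zions has described.

```python
from statistics import mean, stdev

def fit_baseline(amounts):
    """Summarize an account's past transaction amounts as (mean, sample std dev).

    Needs at least two past amounts; statistics.stdev raises otherwise.
    """
    return mean(amounts), stdev(amounts)

def is_suspicious(amount, baseline, threshold=3.0):
    """Flag an amount whose z-score against the baseline exceeds the threshold.

    The threshold of 3.0 is a hypothetical default, not a recommended setting.
    """
    mu, sigma = baseline
    if sigma == 0:
        # Every past amount was identical, so any different amount stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Hypothetical history for one account.
history = [100, 120, 110, 90, 105]
baseline = fit_baseline(history)
print(is_suspicious(5000, baseline))  # far outside the usual range
print(is_suspicious(115, baseline))   # within the usual range
```

A production model would draw on many more signals than amount alone, but the shape is the same: fit on historical data, then score each new event.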
The team gathers data from a number of sources, including banking transaction files covering wire transfers, checking, credit and debit cards, and automated teller machines; IT server logs, which the bank maintains as part of its effort to meet regulatory compliance requirements; and security logs, which pull data from its firewall, domain name system and virtual private network, among other systems.
All told, the bank has a little more than 1.6 petabytes of data stored in its MapR Hadoop system.
Because Hadoop scales linearly, capacity planning is much easier, Fowkes says. “When we need additional capacity to store or process data, it has been easy to add additional servers into our existing cluster with zero downtime,” he says. “And since Hadoop performs well using commodity hardware, we have a good understanding of what it would cost to expand our system.”
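The arithmetic behind that kind of capacity planning can be sketched in a few lines. The example below assumes storage grows in direct proportion to server count. The replication factor of 3 and the 48 TB of usable disk per server are hypothetical figures chosen for illustration, and treating the bank's 1.6 petabytes as pre-replication data is also an assumption.

```python
import math

def nodes_needed(data_tb, replication, usable_tb_per_node):
    """Servers required to hold data_tb of data at the given replication factor."""
    raw_tb = data_tb * replication
    return math.ceil(raw_tb / usable_tb_per_node)

def nodes_to_add(current_tb, target_tb, replication, usable_tb_per_node):
    """Extra servers needed to grow from current_tb to target_tb of data."""
    return (nodes_needed(target_tb, replication, usable_tb_per_node)
            - nodes_needed(current_tb, replication, usable_tb_per_node))

# 1.6 PB taken as 1,600 TB; replication and per-node capacity are assumptions.
print(nodes_needed(1600, 3, 48))        # 100
print(nodes_to_add(1600, 2000, 3, 48))  # 25
```

Because the relationship is linear, multiplying the second figure by a per-server price gives a rough cost estimate for the expansion.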
Other key technology components of the big data initiative include Apache Hive data warehouse software, which runs on top of Hadoop and which the bank uses to manage most of the data stored in the MapR system, and a separate, open source storage system from MongoDB that the bank uses for real-time modeling.
Another component is Apache Storm, an open source, real-time event processing application that Zions uses for real-time fraud monitoring. Storm is designed to make it easy for companies to reliably process streams of data, doing for real-time processing what Hadoop does for batch processing, according to the Apache Software Foundation, which supports the Apache open source development community.
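The difference from batch processing is that each event is evaluated as it arrives rather than in a later report. The sketch below illustrates that pattern with a simple velocity rule in plain Python. It does not use Storm's API, and the class name, the 60-second window and the limit of three events are hypothetical choices for illustration.

```python
from collections import defaultdict, deque

class VelocityMonitor:
    """Flags an account that produces too many events inside a sliding time window."""

    def __init__(self, window_seconds=60, max_events=3):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._recent = defaultdict(deque)  # account -> timestamps still in the window

    def observe(self, account, timestamp):
        """Record one event; return True if the account now exceeds the limit.

        Assumes timestamps (in seconds) arrive in non-decreasing order per account.
        """
        window = self._recent[account]
        window.append(timestamp)
        # Drop timestamps that have aged out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_events

monitor = VelocityMonitor(window_seconds=60, max_events=3)
for ts in (0, 10, 20, 30, 200):
    print(ts, monitor.observe("acct-1", ts))
```

In a stream processor, logic like this would typically run inside a processing step that receives events partitioned by account, so that each account's window lives on one worker.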
The bank is also testing another open source product, Apache Spark, an engine for large-scale data processing that Apache says runs programs up to 100 times faster than Hadoop MapReduce in memory and 10 times faster on disk.
Better Security and Fraud Detection
All of these big data components together have enabled the bank to store, access and analyze large amounts of data as never before—and ultimately enhance security.
“Some of the information security logs and the banking transaction logs are really large in size,” Fowkes says. “In the past, with our existing tools, to run a report against that data, it would take over a day to get the results we needed. With Hive and Hadoop we can get the results in less than 20 minutes.”
The bank now has a quicker turnaround with security forensic activities. “When we had a security incident in the old days we had to track down which server was involved in the incident and then get access to the log file and it could be a time-consuming process,” Fowkes says. “Once we had all of the data centralized within Hadoop we didn’t have to go through that same process. We’ve gone from taking days to a matter of hours to accomplish the same work.”
Fraud detection efforts have seen a similar increase in efficiency, with the process of developing new fraud models going from weeks with the old methods to a matter of days. Some of the fraud models have been built for online and mobile banking, and for wire and ACH transactions, Fowkes says.
What effect has the big data initiative had on improving security and addressing fraud?
“It’s had a big impact,” Fowkes says. “Prior to having this environment most of the fraud detection we did involved using commercial detection tools. If we were targeted with a new type of fraud we had to try to convince the vendor to address this by adding a new feature into the product, and then it might take six months to see that.”
With the big data tools “we have access to all of this consolidated data and the ability to quickly build new fraud models, which allows us to be more nimble,” Fowkes says. “If we see a brand new attack we can use [data about the attack] to build a model and deploy it within days instead of months. We can respond to new attacks much faster, which means reduced losses to fraud.”
Zions Bank sees huge potential for big data and analytics for the overall business, and what began as an initiative to bolster information security and prevent fraud has broadened in recent years, Fowkes says.
“Centralizing our data stores serves multiple uses—from data security to fraud detection to risk management to customer marketing,” Fowkes says.
While he declined to provide details on these initiatives for competitive reasons, he says one project involves enabling business intelligence (BI) teams throughout the company to access banking transactions data in Hadoop, so they can make operational improvements in a number of back-office banking processes.
“We’re working on getting our BI tools that are used for other reporting purposes to interact with the data in Hive so users can get access to more granular transaction data,” Fowkes says. “This is data they don’t currently have access to, and it can help increase efficiency in back-end banking processes.”
The bank is also looking into deploying another open source product, Apache Mesos, to simplify its big data environment. Mesos abstracts CPU, memory, storage and other compute resources away from physical or virtual servers.
“Hadoop, MongoDB, Storm and Spark now each runs on its own cluster of servers,” Fowkes says. “With Mesos, we can run all those on the same cluster of servers.”