How Experian Is Using Big Data
Some businesses look to big data to help them manage their information. Information service company Experian is looking to big data to help manage its business.
The company is probably best known for providing credit reports and scores and protection against identity theft. Experian also works with businesses around the world to manage credit risk, prevent fraud and target marketing offers. It employs about 17,000 people in 39 countries and reported total revenue of $4.8 billion for the year ended March 31, 2015.
Like any company, Experian is always looking to reduce costs, speed the delivery of services and enhance customer relations.
Given that the company gathers so much data as part of its key business processes—an estimated total of 30 petabytes—big data and analytics technologies seemed like a natural fit.
“Many of the daily tasks of our connected lives now rely on big data,” says Kevin Busby, vice president, analytics product management, at Experian. “By analyzing not merely patterns, but the relationship between many different variables, data analysis allows more people to buy homes, more companies to expand their business and more individuals to safely manage their finances.”
Experian has deployed a number of big data technologies, including MapR’s distribution of Hadoop, Syncsort’s DMX-h data integration platform, its own Experian Data Labs big data applications, the Apache Hive open source data query system and Tableau’s data reporting tool.
Big Data Deployment
In the fall of 2014, Experian was looking to process more data in less time without expanding its existing in-house database, which would have involved extensive costs.
It decided to deploy a Linux high-performance computing cluster, which provided increased processing power at a cost-effective price as well as a new way to store more easily accessible data.
Experian also implemented MapR’s distribution of Hadoop, the open-source software framework for distributed storage and processing of large data sets.
At the same time, it brought in DMX-h, a data integration platform that allows companies to shift heavy workloads from data warehouses and mainframes into Hadoop.
A Hadoop cluster can provide more storage at lower costs than a storage-area network (SAN), according to an Experian spokesman. On top of that, each server in the cluster also contributes to the processing capacity.
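The scaling argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name, the node counts and the per-node figures are hypothetical and do not come from Experian, and a replication factor of 3 (a common Hadoop default) is assumed.

```python
def cluster_capacity(nodes, disk_tb_per_node, cores_per_node, replication=3):
    """Estimate storage and compute for a cluster of identical servers.

    All inputs are hypothetical; replication=3 is an assumed default,
    meaning every block of data is stored three times.
    """
    if nodes < 0 or replication < 1:
        raise ValueError("nodes must be >= 0 and replication must be >= 1")
    raw_tb = nodes * disk_tb_per_node
    return {
        "raw_tb": raw_tb,                    # total disk across all servers
        "usable_tb": raw_tb / replication,   # space left after replication
        "cores": nodes * cores_per_node,     # every server also adds compute
    }


if __name__ == "__main__":
    # Doubling the node count doubles both storage and processing capacity.
    print(cluster_capacity(nodes=10, disk_tb_per_node=12.0, cores_per_node=16))
    print(cluster_capacity(nodes=20, disk_tb_per_node=12.0, cores_per_node=16))
```

The point of the sketch is the last field: unlike a SAN, which adds storage only, each server added to the cluster raises the core count along with the disk.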
MapR allows DMX-h to access storage directly, without the need for a code rewrite.
Experian's analytic applications can then leverage Hive to access data in storage.
“As we evolve, we have a platform capable of performing high-speed sort and faster Hadoop query techniques,” says Tom Thomas, Experian's director of the data development technology.
A key piece of Experian’s big data setup is the DMX-h data integration product suite, which the company uses in two key ways.
One is for daily production batch file preparation in the MapR distribution framework. Experian runs proprietary models that read the data via Hive, which facilitates querying and managing large datasets residing in distributed storage. The models create client-specific and industry-unique results, including deep customer insights.
The other use of DMX-h is in the creation and aggregation of metrics—by applying real-time rules against data transactions—so benchmarks are readily available for the company to use.
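As a rough illustration of that second use, the sketch below applies a set of rules to each incoming transaction and keeps running counts per client. It is not DMX-h code and does not reflect Experian's actual rules; the `Transaction` fields, the rule names and the thresholds are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record layout, chosen only for this example.
    client_id: str
    amount: float
    channel: str


# Each rule maps a metric name to a yes/no test on one transaction.
# The names and thresholds are assumptions, not Experian's rules.
RULES: Dict[str, Callable[[Transaction], bool]] = {
    "high_value": lambda t: t.amount >= 1000.0,
    "online": lambda t: t.channel == "online",
}


def aggregate_metrics(
    transactions: Iterable[Transaction],
) -> Dict[str, Dict[str, int]]:
    """Count, per client, how many transactions satisfy each rule."""
    metrics = defaultdict(lambda: defaultdict(int))
    for txn in transactions:
        for name, rule in RULES.items():
            if rule(txn):
                metrics[txn.client_id][name] += 1
    # Convert the nested defaultdicts to plain dicts for the caller.
    return {client: dict(counts) for client, counts in metrics.items()}


if __name__ == "__main__":
    sample = [
        Transaction("a", 1500.0, "online"),
        Transaction("a", 20.0, "store"),
        Transaction("b", 5.0, "online"),
    ]
    print(aggregate_metrics(sample))
```

Because the counts are updated as each transaction arrives, the aggregated figures are available as benchmarks without a separate pass over the data.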
“Most people [looked at] Hadoop and thought, distributed file system,” says Thomas. “We thought, leverage the MapR management console and MapR direct access to the Linux native file system on that same cluster. [W]e were immediately able to connect to [those] existing systems, and then harvest that data via new tools like Hive with Experian models. However, when the end users require, we can just as easily move data from the cluster to databases. The end-user tools are evolving and can now access data on the cluster directly.”
For reporting, Experian uses a data visualization tool from Tableau to display millions of metrics, from separate data points to a landscape of relationships and priorities within data.
The bottom line is increased speed and reduced solution complexity, through less processing and handling of data, Thomas says.
Experian is exploiting its big data solution in several areas: fraud prevention, credit scoring and market-specific solutions for healthcare, car buying and small businesses.
Experian’s fraud reduction service is an area where big data is playing a big role. The company’s infrastructure allows it to protect consumers by using more than 19 terabytes of data and a list of more than 282 predictive attributes to identify true customers.
“The end result is enabling consumers’ access to a seamless experience, online and on mobile devices, with any business that uses our fraud prevention services,” says Keir Breitenfeld, VP of product strategy for Experian's Fraud & ID business. “Combining big data plus analytics, we are able to authenticate users more efficiently so they don’t have a time-consuming process and thus create a negative experience, while at the same time keeping fraud rates down [and] protecting their information.”
On the credit scoring front, Experian’s database contains credit and payment performance information on more than 25 million businesses nationwide. This information, and the insights derived from it, helps financial institutions and other businesses make more sound lending and credit decisions, which enables them to better manage their portfolios and acquire the right customers.
In addition to serving slices of the overall population, Experian is using big data to help people pay for healthcare services.
Experian serves more than 2,900 hospitals and 10,000 other healthcare organizations, representing more than 100,000 providers nationwide, Busby says. Big data has played a role in enabling Experian to help these organizations and their patients deal with increasingly complex healthcare payment issues. Through unique data and analytics, Experian provides insight into each patient’s financial situation. Providers can then easily and efficiently determine which patients meet the requirements for Medicaid and other grant or charity programs, or set up a payment plan that fits within a patient’s current budget.
Experian is also able to provide better service for consumers looking to buy automobiles.
For example, the company’s AutoCheck database houses 5 billion records from auto dealers, government agencies, auctions and other independent sources. It produces reports that allow consumers to see whether a vehicle has been in an accident or has odometer issues, frame damage, title issues or any other potentially negative events that may affect its safety or value, Busby says.
In addition, Experian helps small business owners by providing access to critical data regarding the health of their own business. This information helps them better understand and manage their credit, which in turn enables them to obtain funding that will increase cash flow and uncover growth opportunities.
Experian’s big data initiative also helps companies anticipate the ever-changing consumer environment by giving marketers a complete picture of how customers think and what they do in a multi-channel, multi-cultural world.
“We process more than 1,151 billion records annually, with a global segmentation of more than 2.3 billion consumers in more than 30 countries, and demographic data on 700 million individuals and 270 million households combined,” says Emad Georgy, senior vice president of product development and global head of development of Experian Marketing Services. “That global data asset powers our Experian Marketing Suite. We help brands identify, authenticate and understand their consumers across channels and devices and then engage with [and] respond to them from a single platform.”