How Harte Hanks Turbocharged Big Data
Harte Hanks, the global marketing organization, is no stranger to big data. An innovator in data-driven promotions, the company offers an array of integrated, multi-channel analytics, database marketing, data quality and other information-intensive services to major BtoB and BtoC brands worldwide.
Harte Hanks clients include Hyundai, Mercedes and Sony, to name a few. Companies typically use the marketing services firm to obtain a total view of each customer, to execute personalized promotions and to create cross-channel campaign analytics that give clients realtime information. But given the sheer explosion of customer inputs, from point-of-sale, ecommerce, social media, mobile and online visits to call center contacts, this is no easy task. The avalanche of data coming into Harte Hanks is growing in both volume and complexity.
“Big data applies to our business in a number of ways,” says Rob Fuller, Harte Hanks’ managing director of product innovation. “With the number of companies vying for the customer’s attention, there is a need to integrate so much more data. But these solutions are very costly to scale.”
Harte Hanks provides its clients with a hosted marketing service called Allink that includes a database that consolidates into a single view all interactions by its clients’ customers. The service provides reports as well as analytics enabling clients to study data and develop marketing and sales scenarios. Using this insight into customer behavior, marketers are able to create more sharply tailored campaigns.
But when the company’s clients started encountering sluggish query response times -- up to an hour -- for a marketing scenario, Harte Hanks began searching for a way to supercharge its big-data-based services.
A traditional solution would be to simply add more servers and database capacity. The company evaluated whether to continue scaling up with larger and more expensive computers. However, Harte Hanks discovered that in addition to escalating hardware costs, it faced the likelihood of having to invest in Oracle database upgrades to accommodate customers’ needs.
The solution Harte Hanks found centered on Splice Machine, a big-data-based relational database management system, and Cloudera CDH, which contains the Hadoop big-data software framework and a number of other features including data security. The Splice Machine and Cloudera combination now supports the digital marketing applications that are part of Harte Hanks’ Allink service.
Splice Machine’s RDBMS connects programs written in SQL -- the lingua franca of many corporate business intelligence systems -- with the big data-driven world of Hadoop and HBase, an open-source, distributed, non-relational database. In effect, Splice Machine acts as a standard SQL database that works with the distributed computing infrastructure of Hadoop.
But the result is anything but a standard RDBMS.
“In order to deliver the scalability of Hadoop, you need a platform that everybody understands such as SQL,” says Monte Zweben, co-founder and CEO of Splice Machine, a 2-year-old company based in San Francisco that released its initial product in November. Splice Machine, being a distributed database, connects with a cluster of commodity Linux servers, parceling out a portion of the processing work to each server running HBase so that the servers run the work in parallel. Splice Machine then “splices” the results of each distributed processing job back together.
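The split-then-splice pattern described above can be illustrated with a minimal Python sketch. This is not Splice Machine's code or API: the data, the function names (`partial_totals`, `splice`, `total_spend`) and the use of local threads in place of HBase servers are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for the rows held by one server.
PARTITIONS = [
    [("alice", 120), ("bob", 80)],
    [("alice", 30), ("carol", 200)],
    [("bob", 45), ("carol", 5)],
]


def partial_totals(rows):
    """Work done on one node: sum spend per customer over its local rows only."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    return totals


def splice(partials):
    """Merge the per-node partial results into a single answer."""
    merged = {}
    for part in partials:
        for customer, amount in part.items():
            merged[customer] = merged.get(customer, 0) + amount
    return merged


def total_spend(partitions):
    """Hand each partition to its own worker, then splice the results together."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_totals, partitions))
    return splice(partials)


if __name__ == "__main__":
    print(total_spend(PARTITIONS))
```

In SQL terms this is roughly what a query like "total spend grouped by customer" would ask for. The point of the sketch is only that each worker sees a slice of the data and that the final answer comes from merging the partial results.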
“The big data world is transitioning to a new platform for realtime applications and reporting,” Zweben points out. Harte Hanks is not alone in facing this challenge: “many companies are experiencing pain,” Zweben says. “The velocity of all that data coming into their system is too much to handle.”
Similar to Oracle and MySQL, Splice Machine can handle typical operational or analytical queries. What’s more, it can scale from terabytes to petabytes using inexpensive commodity servers -- just as Hadoop and HBase scale to dozens of petabytes on commodity servers.
“The ability to get really phenomenal performance matters, because it enables marketers to do more,” Harte Hanks’ Fuller explains. “Now we are able to do in 13 seconds what used to take a minute and a half, or to do in eight minutes what used to take 35 minutes on the same infrastructure.
“It means you can do that many more marketing scenarios,” he adds. “You can explore more opportunities. For instance, I can actually tell you what a person is likely to purchase next. And we can make marketing recommendations in realtime. In terms of marketing effectiveness, it means you can develop more intelligent marketing campaigns.”
But putting together this high-performing solution for clients required more than just a single big data technology. Harte Hanks utilizes a stack of technologies, among them:
- IBM Unica’s retail campaign management.
- IBM Cognos for reporting.
- Tableau’s business intelligence dashboard.
- Harte Hanks’ own Trillium data cleanup and governance system.
The bottom line for Harte Hanks is the ability to sharpen performance for clients at a reduced cost. The company has achieved a three- to seven-times increase in query speeds, at 25% of the cost of total infrastructure.
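As a rough check, the two timings Fuller cites (a minute and a half down to 13 seconds, and 35 minutes down to eight) can be converted into speedup factors. The short Python sketch below does only that arithmetic; the helper name `speedup` is illustrative, and the two timings are the only inputs taken from the article.

```python
def speedup(before_seconds, after_seconds):
    """Return how many times faster the new run is than the old one."""
    return before_seconds / after_seconds


# Timings quoted by Fuller, converted to seconds.
short_query = speedup(90, 13)            # a minute and a half -> 13 seconds
long_query = speedup(35 * 60, 8 * 60)    # 35 minutes -> eight minutes

if __name__ == "__main__":
    print(f"short query: {short_query:.1f}x faster")
    print(f"long query:  {long_query:.1f}x faster")
```

Both factors, roughly 6.9 and 4.4, fall inside the three- to seven-times range the company reports.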
“This is ideal,” sums up Fuller. “I can tap into the value of big data and move my clients along that continuum.”