Behind American Express’ Machine Learning Effort
American Express is no stranger to analytics and data management. But more recently, the financial services giant has been unlocking the power of machine learning.
The effort spans data-science know-how; a willingness to explore new or emerging technologies; the blending of existing and new team members; and key vendor relationships -- including ongoing work with MapR, the Hadoop provider. (See related story, debuting with a link here on Sept. 16.)
In an exclusive interview, two of American Express's data science leaders agreed to share perspectives, milestones and details about the company's machine learning journey. They also describe American Express's key machine learning priorities for 2015. The featured experts:
- Chao Yuan, senior vice president and head of Decision Science for American Express.
- Sastry Durvasula, senior vice president of Enterprise Information Management and Digital Partnerships for American Express.
Now, onto the interview...
Information Management: Let's start with the big picture. What are your key big data and machine learning priorities for 2015?
Yuan: "Our big data and machine learning efforts center around four main objectives:
- Improving and enhancing service to customers
- Creating and facilitating commerce between our Card Members, merchants and partners
- Helping make our products more relevant to prospective customers and existing customers
- Managing risk"
Information Management: Was there any particular event or specific business need that started American Express on its machine learning journey?
Yuan: "American Express has a rich history of using data and analytics to create deeper relationships with potential and current customers, but it’s the advent of machine learning that has allowed our scientists to harness the full power of our data. American Express’ Risk & Information Management team, in partnership with the company’s Technology group, embarked on a journey to build world-class big data capabilities nearly five years ago. Big data analytics help us drive commerce, service our customers more effectively and detect fraud."
Information Management: I assume the big data initiative now involves multiple applications and machine learning efforts. Can you mention one or two more recent efforts, the business need, and the outcomes?
Durvasula: "As you know, we have a long heritage of customer service, so creating unparalleled experiences for Card Members remains a top priority. A good marketing example would be when we suggest Amex Offers to provide merchant offers to Card Members. Our machine learning algorithm assesses things like the Card Member’s spending preferences and which offers have been redeemed in the recent past. Using all the data in our big data set, a group of offers is prioritized for each individual, and the offers keep evolving as the algorithm learns and improves constantly. This process ensures we are providing the most relevant offers, customized uniquely for every Card Member.
In addition, we use machine learning to identify potential fraud concerns whenever an American Express Card is used anywhere in the world. Our machine learning models help to protect $1 trillion in charge volume every year. Making the decision in less than 2 milliseconds allows us to approve charges at the point of sale, with the least amount of disruption to our customers. The point-of-sale decisions we make using machine learning in turn automatically trigger fraud alerts to our Card Members through instant emails, text messages and smartphone notifications. Card Members are able to verify charges through these channels (as well as via our website/phone) very quickly, allowing them to continue with their transaction without further disruption. Machine learning is the latest stage in our evolution of fraud tools as we strive to continuously meet our customers’ expectations for trust and security."
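Editor's illustration: the offer prioritization Durvasula describes, which weighs a Card Member's spending preferences against recently redeemed offers, can be pictured with a toy ranking function. This is only a minimal sketch under stated assumptions, not American Express's algorithm: the function name `rank_offers`, the `redemption_boost` weight, the category labels and all of the numbers are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def rank_offers(spend_by_category, recent_redemptions, offers, redemption_boost=0.5):
    """Return offers sorted from most to least relevant for one Card Member.

    Hypothetical scoring rule (an assumption, not the production model):
    score = share of the member's spend in the offer's category
            + redemption_boost * number of recent redemptions in that category
    """
    total_spend = sum(spend_by_category.values()) or 1.0  # avoid dividing by zero
    redeemed = Counter(recent_redemptions)  # category -> recent redemption count

    def score(offer):
        category = offer["category"]
        spend_share = spend_by_category.get(category, 0.0) / total_spend
        return spend_share + redemption_boost * redeemed[category]

    # sorted() is stable, so offers with equal scores keep their input order.
    return sorted(offers, key=score, reverse=True)


# Made-up example data for a single Card Member.
offers = [
    {"id": "A", "category": "dining"},
    {"id": "B", "category": "travel"},
    {"id": "C", "category": "grocery"},
]
spend = {"dining": 300.0, "travel": 600.0, "grocery": 100.0}

ranked = rank_offers(spend, ["dining"], offers)
print([offer["id"] for offer in ranked])
```

With this made-up data, the recent dining redemption lifts offer A above the travel offer even though the member spends more on travel. A system like the one described in the interview would learn such weights from data and update them over time rather than fixing them by hand.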
Information Management: Looking back on the machine learning and big data journey so far, what stumbling blocks (if any) did American Express experience and how will you avoid similar challenges in the future?
Yuan: "We’ve learned some critical lessons as we’ve moved forward with our implementation of big data. The first is ensuring we have the right talent to make it happen. We brought in strong outside talent and combined it with homegrown talent: team members who understood our business model and our culture.
One other key learning was that, in order for the power of big data to be utilized most effectively, it was important to put it directly in the hands of those people who could provide the greatest value and insight. We have hundreds of PhDs and data scientists who develop the capabilities not just so they can use them. They develop them to be user friendly so that these capabilities can be used by our marketing managers, our risk analysts and our sales teams. These are the people who can do the most with the insights we gain from big data, and this is a key objective in all of our big data development.
Finally, given our strong focus on what customers expect from the American Express brand, privacy, security and trust must be at the top of the list as we develop and utilize big data. We have built our capabilities from the ground up with the ability to govern the use of the data and manage access to it in a very granular way."
Information Management: We've also heard that MapR is involved in your efforts. Can you tell me a bit about American Express's original engagement with MapR and the overall relationship?
Yuan: "We don’t discuss specific details of the work we do with our partners. However, I can tell you that we’re working with MapR to enhance our abilities to analyze data in responsible ways that benefit our Card Members and business."
Information Management: Just how large is the Hadoop storage grid at this point?
Durvasula: "Our big data ecosystem has been built leveraging Hadoop and other industry-leading technologies, and supports all business units at multi-petabyte scale. The platform boasts best-in-class engineering metrics, which include a 45-second TeraSort and a 1.65 TB MinuteSort. The platform is highly scalable, with a flexible architecture to meet growing business demands."
Information Management regularly profiles data scientists who are driving the next wave of big data, machine learning, analytics and more. To suggest a potential interview please contact Editor Joe Panettieri.