BNY Mellon is taking on a major IT project that's reimagining everything about how the $1.4 trillion-asset bank stores, crunches, uses and delivers data - perhaps the second most important four-letter word in banking after cash.

The bank's IT executives have realized that strong data management and analytics are no longer a nice-to-have, but a must-have.

"If you log into Amazon, there's the ability [at Amazon] to understand who you are and what you have done in the past," says CIO Suresh Kumar.

Kumar is speaking glowingly about Amazon's ability to drive experience based on what it has learned about users - something the bank hopes to accomplish. In a new global tech initiative, BNY Mellon will leverage elements of what's called "big data" - or what Kumar says is the greater use of larger volumes of nontraditional unstructured data and search analytics. But more than that, the bank will challenge its employees to see their own business and relationship with the larger bank differently, in a more collaborative manner in which information gathered for a single purpose can be crunched to help decisions enterprise-wide. When a customer accesses the bank through any channel, all of the information about the interactions the bank has had with that client can be made available. That will drive what the user - either a client or an employee - hears and sees next from the bank.

"It's really a different mindset. It's not that we didn't keep the data to accomplish this before. For example, for regulatory purposes we had to keep certain transaction and customer information that could help serve clients in other ways. But for the most part, we would only use the data if a regulator asked for that information. But why can't we use that for other purposes?" Kumar says. For example, the bank hopes to repurpose trade data to efficiently search cost basis information - or the original value of an asset for tax purposes adjusted for splits, dividends and return of capital distributions - which is used to determine the capital gain. By obtaining this information, the bank plans to determine and provide intelligence about what a client is likely to look for in an effort to expand customer service for the person that's executing the trades. "There's a lot of insight you can get from looking at a history of trades...It's not necessarily that we need more data, it is a matter of gaining insight into the data we have," Kumar says.

BNY Mellon is embarking on its project at a time when almost all banks are sorely behind when it comes to collecting and crunching customer and market information to be an informed partner with their customers and staff. Entire retail industries are moving ahead of banking when it comes to reaching customers where, when and how they want to be reached - with the right message at the right time on the right device - with suggestions that actually resonate with consumers. It's a holy grail that still eludes almost all traditional financial institutions, and it's why you hear so much about PayPal, Square, Movenbank, Google and even Walmart doing the innovative work in user experience that banks should be doing.

"The customers and employees have great tech at home...Twitter, Facebook, Google, and they expect the same from us," Kumar says. "That experience is what we have to compete with for our portal. Our challenge is, 'What can we do to improve user experience for clients when they log in? What can we learn about them to give them what they need at the right time?'"

Big data is still a vague concept that generally starts with an exceptionally large data store, sometimes supplemented by added data sources, such as social networking sites, mobile commerce and web-enabled financial management tools that aggregate financial and payment information.

The bank's goal is to make BNY Mellon's web analysis work "more like Google" in the sense that when people log into the bank's site from a PC or mobile device, or a more traditional channel, the bank is ready with a full picture that leverages all of this new data to anticipate a service query or a need. For example, Google's analytics program provides marketers with information on a consumer's web visits, the visits' geographic origin, time of day, amount of time spent on site and what parts of a site that consumer visited.

At BNY Mellon, the new data project will include the use of new cross-departmental database technology and search technology that will acquire broader data on users and take greater advantage of the data that the bank is already accumulating by allowing it to be aggregated and shared across the enterprise for different purposes. "Any kind of data that we used to just archive and put away for compliance reasons or for bank operations is an opportunity to get value. And there is also a ton of information that is collected that gives us insight into how people use different types of technology for different transaction types," Kumar says. If it works well, the bank's customer service and delivery of digital products such as payments, transactions and personal financial management will be more meaningful. And operational, legal and credit risk will also improve, because the bank will be storing and accessing the right amount of data for the right amount of time for legal queries and compliance.

Searching for Answers

As part of the new Big Data project, the bank, which employs 13,000 technologists globally, will seek to move beyond the siloed and department-centric manner in which data has been stored and analyzed. BNY Mellon also hopes to enable centralized access to its data regardless of which data center it chooses for storage. The bank has data centers around the world and its strategy will be independent of physical location.

"Traditionally, with most financial institutions, you get numerous screens with lots of fields for filtering data. But the expectation of users is that the system already has sufficient pieces of information on what that customer is looking for. If you look at the Google approach, where you get a single box that is smart enough to know what you are looking for, that is what we are going to do," says Kumar.

The bank is deploying open source and NoSQL technologies that allow data to be centrally accessed by different departments for disparate uses. NoSQL refers to a database management system that's optimized for retrieval and append operations, which is considered more scalable and useful when managing a large amount of data. One of the main use cases for NoSQL is to analyze social media posts or web server logs from large groups of users.

BNY Mellon's database tech includes MongoDB, Cassandra and Lucene, which are designed to scale based on need and expansion. MongoDB is an open source NoSQL document oriented database which Kumar says can store time series effectively in addition to content management. Kumar says that in a large organization, there are lots of applications that create and store operational data. For any application that needs data across the enterprise, such as risk, the application typically receives that data from its source system and creates a data warehouse. Over a period of time, he says you end up with too many data warehouses with redundant data. The bank plans to use Cassandra to store the data once. "The quality of data is better because we have [centralized access to] data to keep track of instead of multiple [department data sources]," Kumar says. He also says that by using open source technology, the bank is able to scale to store more data at less expense because it's managing less hardware. "Your cost of ownership goes down."

To improve web search analysis, the bank is using Lucene search engine software to index both unstructured data, including non-traditional sources such as social networks, and structured data that includes more traditional transaction and customer records. Written in Java, Lucene is an open source text search engine library that allows ranked searching and different types of search queries, such as searches based on phrases, wildcards (symbols such as "?" that can be used in place of actual characters), and regional proximity, which can be useful in location-based marketing. Solr, the open source search server built on Lucene, was updated in October to include a new web-based user interface, a spell checker and support for data that aids in geographic searches. Kumar, who called Lucene "incredible," says it is in production presently and has had a huge adoption in a short period of time.
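To make the phrase and wildcard query types concrete, here is a toy inverted index in Python. It is a teaching sketch only: it does not use Lucene or Solr, it is not their API, and it skips ranking entirely. The `TinyIndex` class and the sample documents are hypothetical, and in this sketch `?` matches a single character while `*` matches any run of characters.

```python
import fnmatch
from collections import defaultdict


class TinyIndex:
    """Hypothetical toy index: term -> document id -> word positions."""

    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(list))

    def add(self, doc_id, text):
        # Naive tokenizer: lowercase and split on whitespace.
        for pos, term in enumerate(text.lower().split()):
            self.postings[term][doc_id].append(pos)

    def wildcard(self, pattern):
        # Match the pattern against every indexed term, then merge doc ids.
        hits = set()
        for term in list(self.postings):
            if fnmatch.fnmatchcase(term, pattern.lower()):
                hits |= set(self.postings[term])
        return hits

    def phrase(self, text):
        # A document matches if the words appear at consecutive positions.
        words = text.lower().split()
        if not words:
            return set()
        hits = set()
        for doc_id, starts in self.postings.get(words[0], {}).items():
            for start in starts:
                if all(
                    start + i in self.postings.get(word, {}).get(doc_id, [])
                    for i, word in enumerate(words)
                ):
                    hits.add(doc_id)
                    break
        return hits


index = TinyIndex()
index.add(1, "wire transfer to custody account")
index.add(2, "transfer agent fee for custody")
index.add(3, "trade settled in custody account")

print(index.phrase("custody account"))  # documents 1 and 3
print(index.wildcard("tra?e"))          # document 3 ("trade")
print(index.wildcard("trans*"))         # documents 1 and 2 ("transfer")
```

Storing word positions alongside document ids is what makes phrase matching possible; a production engine adds analysis, scoring and far more efficient data structures.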

"One of the challenges we've always had with projects is answering questions such as, is it on time and is it on budget? But also, you really want to know if people are actually using the product that resulted from the project, and why they are using it or not," Kumar says. "Having insight into how people engage the bank allows you to build new tech in a way that is much more likely to be used."

BNY Mellon hopes to find out information such as the time between the rollout of a product or new piece of technology and when that initiative reaches certain levels of adoption - along with user sentiment that can give a clue as to why there are adoption shortfalls or abandonment of user sessions. "Was it due to a lack of awareness of the product's benefits, for example? That would lead us to improve the messaging around the marketing," Kumar says.

While open source should reduce cost and increase scale, it doesn't come without challenges, one being change management as the bank's team of thousands of tech workers adjusts to using open source and commodity hardware, or data storage and management that's built more on shared components than more proprietary systems.

"For anyone who's used to one way of doing something, they have to expand their horizon and see what other ways are out there, and what the pros and cons are," Kumar says.

Information Lifecycle Management

BNY Mellon also hopes to leverage centralized and expanded data analysis to benefit legal and compliance work.

Allen Cohen, managing director and CIO of BNY International, is helping lead an information lifecycle management initiative that's creating a central repository for all legal matters in flight or pending across all organizations at the bank.

By using a shared workflow driven by centralized database technology, BNY Mellon's legal and records information department (RID) can perform cross-department and broad geographic electronic discovery.

The strategy is leveraging work done by the Compliance, Governance and Oversight Council (CGOC). The council is a forum of more than 1,900 legal, IT, records and information management professionals from corporations and government agencies. The group discusses and produces guidance for records discovery, retention, privacy and governance - a job that's getting harder as data expands explosively.

The CGOC has found that for most organizations, information volume doubles every 18-24 months, and 90 percent of the data in the world has been created in the past two years. Storage of that data consumes about 10 percent of a typical firm's IT budget - a share that's projected to reach 40 percent by 2014. The group has published the "Information Lifecycle Leader Reference Guide" to aid disposal programs for expired data. The guide includes tips on defining the economic and business objective of an information governance program, as well as establishing a program strategy, structure and organization that aligns functional silos to ensure business objectives and financial targets are reached. It also identifies new processes for defensible disposal.

With the help of these principles, BNY Mellon is in the midst of a multi-year initiative to centralize and automate data retention schedules for paper and electronic data - including structured and unstructured data that can be subject to legal discovery. The bank is unifying processes in legal, risk and IT and is providing a framework to link job duties and data across departments. It's also creating a global standard taxonomy that identifies the business value of all information and automates legal holds.

"All of the information in our organization is mapped to a records retention polity, which maps data types to people, applications and a records treatment policy," Cohen says. "Once you see all of that in one place, that gives you the ability to say this particular piece of information is no longer needed."

Better Than an On-site Visit

Other financial institutions are finding advanced data accumulation and storage can come in handy when older marketing techniques become less effective.

"A lot of employers don't like you to be onsite to do marketing. And it's expensive to have someone onsite to market or sell the credit union's products," says Mona Leung, CFO of Alliant Credit Union, an $8 billion-asset Chicago-based credit union whose members include staff from more two dozen sponsor organizations and 19 communities near O'Hare Airport.

Alliant is accommodating these challenges by delivering marketing and other consumer-direct messages on LinkedIn, professional blogs and other social networking venues typically used by employees of the credit union's target organizations.

Leung says the credit union is able to track data on users and preferences almost immediately - and that information informs further marketing and sales campaigns. Like BNY Mellon, Alliant isn't reinventing the wheel when it comes to gathering information; most of this data is already being generated. "This unstructured data, such as notes or email or correspondence, has always been there, but now you can get the benefits."

When prospective members sign into LinkedIn, they receive a banner touting the credit union and its menu of financial services, along with a link to Alliant's landing page. "We are able to track where the users came from, either LinkedIn or another site such as Facebook, and we can get a sense of their activity on social networks, and that gives you an advantage when going after new members or customers," Leung says.

David Wallace, global financial services marketing manager for SAS, which provides the business intelligence software that underpins Alliant's growing data analysis efforts, says the expansion of data sourcing puts more of the interaction between consumer and the financial institution in play as a potential data-driven sales opportunity.

SAS recently developed a product called DataFlux Marketplace, a software-as-a-service (SaaS) offering that integrates email addresses into business applications, processes and websites. This technology is designed to ease linkage between email and other parts of the enterprise in an effort to roll email information into broader analysis.

"You want to be able to capitalize on every interaction...those digital trails mean something," Wallace says.

There are also risks connected to "big data," namely that it makes information governance more difficult since there's additional data being generated, retrieved and stored - and by more people. IDC says that 1.8 trillion gigabytes of data were generated in 2011, more than the previous five years combined, and a vast majority of that was unstructured data, which is hard for banks to track because it's generated by social networks, email and other sources that reside outside of traditional transaction information.

"Everybody has software to create data now...it's hard to keep track of who owns what data. If you have a big company with sensitive information, you have to know how to allocate access to all of these tons of data," says Eric Kamander, identity service engineer for CIBC.

Varonis, an unstructured data governance software firm, in September surveyed 200 IT executives and found that two thirds were not confident that sensitive data was protected during the data migrations that typically follow a merger or large IT project.

While Varonis, which operates in a data ownership governance tech space that also includes firms such as Imperva and GovernanceMetrics, has an interest in the market, it's still worth noting that 65 percent said they were not confident that sensitive data was only accessible to the right people during a migration.

A client of Varonis, CIBC uses tech that tracks data that's manufactured and stored in locations such as file systems and Exchange email servers to audit data access and track "ownership" of the data, or which staff member generated or collected the information and may be storing the data on his or her email server.

"Establishing ownership is the lynchpin to governance. You need to know who to turn to for each piece of data to authorize access, review access and remove unauthorized access," Kamander says.

He says there have been thousands of access revocations, many of them voluntary by staffers who were unaware of the risk of data storage, since CIBC deployed the governance tech last year, a move that's also allowed the institution to remove data that was redundant or being stored unnecessarily.

Kamander didn't give specific storage savings for CIBC, but says "The real savings is on the reputational side. You can't place a dollar value on reputational risk."

Another challenge in adopting Big Data is the technology expertise needed to manage the open source development and data analytics techniques.

BNY Mellon's Kumar spoke of the change in IT culture required by new data technology, particularly the need to become more comfortable working with open architecture and other shared technology.

But there's also the issue of finding technologists with the proper training, which will become a challenge as more firms in and out of finance look to leverage the benefits of big data.

Brian McCarthy, executive partner, Accenture Finance and Performance Management, adds there is a shortage of IT talent with big data skills in North America and Western Europe. He says the combination of computer science skills and data modeling is still early stage in education.

"Universities are working on new programs, but there won't be enough of the skills to satisfy the demand, so there will be an imbalance. There will be an early mover advantage to take advantage of this talent," he says.

5 Uses for Big Data

There are almost as many ways to apply new data accrual and management techniques as there are emerging sources of intelligence. BTN spoke with tech pros and analysts to find out some of the ways in which advanced data can be applied.

  1. Marketing
    As Alliant Credit Union's Mona Leung points out, social media information can be an invaluable source of data on how people are responding to marketing campaigns, and that data can be incorporated into more traditional CRM systems. And social networking sites themselves are a good place to gain extra intelligence on bank branding. "There's an opportunity to pick up end user sentiment through social media, and to fine tune marketing campaigns," says Mike Versace, research director, IDC Financial Insights.
  2. Location
    Geolocation information, or data about a mobile user's geographic location, counts as "big data" for most analysts, who say information about where consumers are shopping can help inform special offers or loyalty programs. "There's a merchant intelligence that's possible here, optimizing offers that can really drive a lot of growth in mobile banking," Versace says. 
  3. Fraud prevention
    S Ramakrishnan, group vice president and general manager, Oracle Financial Services Analytical Applications, says companies have a better chance to spot people (internal or external) behaving badly, by having a broader view of relationships and activity. "When you are accumulating large amounts of data, what people are doing and where they are using that data is a part of that. There are some hidden behaviors that can be uncovered as a result of analysis of that data," he says. 
  4. Investing
    Unstructured data from sources such as financial or personal blogs can provide insight into the tastes and appetites of consumers when it comes to retirement or wealth management. "It's something that advisors can use to follow customers. It can give better and more meaningful information on a client. But it's still pretty early in the adoption cycle to see this used as a data source for investment management," says Omer Sohail, a principal in Deloitte Consulting LLP and Banking & Security Business Analytics and Information Management Leader.
  5. Mortgages
    Residential real estate is an ideal venue for alternative data, with a wide variety of information available from public sources, such as homeownership demographics, regional price changes, and blogs and social networking pages dedicated to homebuyers and sellers, which can give a lender a market pulse when combined with static sources such as real estate listings and consumer account information. When run through analytics, there's more information that can be used for underwriting, decisioning, credit risk and pricing. "Lenders can target customers more specifically because you can pull more data together to get a comprehensive view," says Brian McCarthy, managing director, Accenture Financial Services Analytics, North America. 

This story originally appeared at Bank Technology News.
