"Traditionally, with most financial institutions, you get numerous screens with lots of fields for filtering data. But the expectation of users is that the system already has sufficient pieces of information on what that customer is looking for. If you look at the Google approach, where you get a single box that is smart enough to know what you are looking for, that is what we are going to do," says Kumar.
The bank is deploying open source and NoSQL technologies that allow data to be centrally accessed by different departments for disparate uses. NoSQL refers to a class of database management systems optimized for retrieval and append operations, an approach considered more scalable and useful when managing large amounts of data. One of the main use cases for NoSQL is analyzing social media posts or web server logs from large groups of users.
BNY Mellon's database tech includes MongoDB, Cassandra and Lucene, which are designed to scale based on need and expansion. MongoDB is an open source, document-oriented NoSQL database that Kumar says can store time series data effectively and can also be used for content management. Kumar says that in a large organization, there are lots of applications that create and store operational data. For any application that needs data across the enterprise, such as risk, the application typically receives that data from its source system and creates a data warehouse. Over time, he says, you end up with too many data warehouses holding redundant data. The bank plans to use Cassandra to store the data once. "The quality of data is better because we have [centralized access to] data to keep track of instead of multiple [department data sources]," Kumar says. He also says that by using open source technology, the bank is able to scale to store more data at less expense because it's managing less hardware. "Your cost of ownership goes down."
To improve web search analysis, the bank is using Lucene search engine software to index both unstructured data, including non-traditional sources such as social networks, and structured data that includes more traditional transaction and customer records. Written in Java, Lucene is an open source text search engine library that allows ranked searching and different types of search queries, such as searches based on phrases, wildcards (symbols such as "?" that stand in for characters within a search term), and regional proximity, which can be useful in location-based marketing. Solr, the open source search server built on top of Lucene, was updated in October to include a new web-based user interface, a spell checker and support for data that aids in geographic searches. Kumar, who called Lucene "incredible," says it is now in production and has seen wide adoption in a short period of time.
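As a rough illustration of two of the query types mentioned above, the following Python sketch imitates wildcard and phrase matching on plain strings. It does not use Lucene or Solr and is not the bank's implementation; the function names `wildcard_match` and `phrase_match` are hypothetical, and the wildcard rules assumed here ("?" for exactly one character, "*" for zero or more) follow the conventions of Lucene's query syntax.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def wildcard_match(pattern, term):
    """Return True if term matches pattern.

    '?' stands for exactly one character, '*' for zero or more.
    Matching is case-sensitive in this simplified sketch.
    """
    regex = "".join(
        "." if ch == "?" else ".*" if ch == "*" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, term) is not None


def phrase_match(phrase, text):
    """Return True if the words of phrase appear consecutively in text."""
    wanted = tokenize(phrase)
    words = tokenize(text)
    if not wanted:
        return False
    return any(
        words[i:i + len(wanted)] == wanted
        for i in range(len(words) - len(wanted) + 1)
    )
```

A real search engine builds an index so that it does not have to scan every document for each query; this sketch only shows what each query type means.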
"One of the challenges we've always had with projects is answering questions such as, is it on time and is it on budget? But also, you really want to know if people are actually using the product that resulted from the project, and why they are using it or not," Kumar says. "Having insight into how people engage the bank allows you to build new tech in a way that is much more likely to be used."
BNY Mellon hopes to find out information such as the time between the rollout of a product or new piece of technology and when that initiative reaches certain levels of adoption - along with user sentiment that can give a clue as to why there are adoption shortfalls or abandonment of user sessions. "Was it due to a lack of awareness of the product's benefits, for example? That would lead us to improve the messaging around the marketing," Kumar says.
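The time-to-adoption measure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the bank's method: the function name `days_to_adoption`, the use of daily active users as the adoption signal, and the sample figures are all hypothetical.

```python
from datetime import date


def days_to_adoption(launch, daily_active_users, target_users):
    """Return the number of days from launch until daily active users
    first reach target_users, or None if the target is never reached.

    daily_active_users maps a date to the user count observed that day.
    """
    for day in sorted(daily_active_users):
        if day >= launch and daily_active_users[day] >= target_users:
            return (day - launch).days
    return None


# Hypothetical usage figures for a product launched on 1 January 2013.
usage = {
    date(2013, 1, 1): 50,
    date(2013, 1, 15): 400,
    date(2013, 2, 1): 1200,
    date(2013, 2, 10): 1500,
}
print(days_to_adoption(date(2013, 1, 1), usage, 1000))
```

A result of None for a given target would flag the kind of adoption shortfall that the sentiment analysis is then meant to explain.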
While open source should reduce cost and increase scale, it doesn't come without challenges. One is change management, as the bank's team of thousands of tech workers adjusts to open source software and commodity hardware, and to data storage and management built on shared components rather than proprietary systems.
"For anyone who's used to one way of doing something, they have to expand their horizon and see what other ways are out there, and what the pros and cons are," Kumar says.
Information Lifecycle Management
BNY Mellon also hopes to leverage centralized and expanded data analysis to benefit legal and compliance work.
Allen Cohen, managing director and CIO of BNY International, is helping lead an information lifecycle management initiative that's creating a central repository for all legal matters in flight or pending across all organizations at the bank.
By using a shared workflow driven by centralized database technology, BNY Mellon's legal and records information department (RID) can perform cross-department and broad geographic electronic discovery.
The strategy leverages work done by the Compliance, Governance and Oversight Council (CGOC), a forum of more than 1,900 legal, IT, records and information management professionals from corporations and government agencies. The group discusses and produces guidance for records discovery, retention, privacy and governance - a job that's getting harder as data expands explosively.
The CGOC has found that for most organizations, information volume doubles every 18-24 months, and 90 percent of the data in the world has been created in the past two years. Storage of that data consumes about 10 percent of a typical firm's IT budget - a share that's projected to reach 40 percent by 2014. The group has published the "Information Lifecycle Leader Reference Guide" to aid disposal programs for expired data. The guide includes tips on defining the economic and business objectives of an information governance program, as well as establishing a program strategy, structure and organization that aligns functional silos to ensure business objectives and financial targets are reached. It also identifies new processes for defensible disposal.
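To put the doubling figure in perspective, a volume that doubles every T months grows by a factor of 2^(m/T) after m months. The short Python sketch below applies that formula to the 18-month and 24-month ends of the range; the function name `growth_factor` and the five-year horizon are illustrative choices, not figures from the CGOC.

```python
def growth_factor(months_elapsed, doubling_period_months):
    """Return how many times larger a volume becomes after months_elapsed
    if it doubles once every doubling_period_months."""
    return 2 ** (months_elapsed / doubling_period_months)


# Growth over five years (60 months) at each end of the 18-24 month range.
for period in (18, 24):
    print(period, round(growth_factor(60, period), 1))
```

Under these assumptions, five years of growth multiplies stored volume roughly sixfold at the slow end of the range and roughly tenfold at the fast end.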
With the help of these principles, BNY Mellon is in the midst of a multi-year initiative to centralize and automate data retention schedules for paper and electronic data - including structured and unstructured data that can be subject to legal discovery. The bank is unifying processes in legal, risk and IT and is providing a framework to link job duties and data across departments. It's also creating a global standard taxonomy that identifies the business value of all information and automates legal holds.
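The interaction between a retention schedule and a legal hold can be illustrated with a small Python sketch. Everything here is hypothetical: the record categories, the retention periods, and the names `Record`, `RETENTION_YEARS` and `is_disposable` are invented for illustration and do not describe BNY Mellon's taxonomy or systems. The one rule the sketch encodes is that a record under legal hold is never eligible for disposal, even after its retention period has expired.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention periods in years, keyed by record category.
RETENTION_YEARS = {"trade_confirmation": 7, "marketing_email": 2}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    legal_hold: bool = False


def is_disposable(record, today):
    """Return True if the record's retention period has expired
    and it is not under a legal hold."""
    if record.legal_hold:
        return False
    years = RETENTION_YEARS[record.category]
    try:
        expiry = record.created.replace(year=record.created.year + years)
    except ValueError:
        # Created on 29 February and the expiry year is not a leap year.
        expiry = record.created.replace(year=record.created.year + years, day=28)
    return today >= expiry
```

Keeping the schedule in one shared table, rather than in each department's own system, is the kind of centralization the initiative describes.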