Data control: How to get it, how to make it work for you

How hard is it to keep track of data today? Business intelligence (BI) teams – which in large organizations are in charge of doing that – have to contend with mountains of data that were unimaginable just a few years ago. The manual search methods most of them use simply cannot cope with the sheer volume of data that the average organization collects.

The average American, for example, uses 4.1 GB of data on their cell phone every month, a figure expected to more than double by 2021; a gigabyte of data represents roughly 64,782 Word pages. That means that each smartphone user of a carrier like Verizon – a company that has about 150 million customers – is generating about 265,606 “pages” of data a month, or 3,187,274 a year, a figure that will grow to nearly 7 million by 2021.
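
The arithmetic behind these figures can be checked with a short sketch. The inputs are the numbers quoted above; the variable names are illustrative, and the 2021 projection is left out because the text only says usage will "more than double."

```python
# Figures quoted in the article; the variable names are illustrative.
GB_PER_MONTH = 4.1      # average monthly cellular data use per person
PAGES_PER_GB = 64_782   # rough Word-page equivalent of one gigabyte

pages_per_month = GB_PER_MONTH * PAGES_PER_GB
pages_per_year = pages_per_month * 12

print(f"{pages_per_month:,.0f} pages per user per month")
print(f"{pages_per_year:,.0f} pages per user per year")
```

Multiplying the monthly figure by a subscriber count such as 150 million gives a sense of the volume a single carrier would face.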

BI teams, which are usually large and well-paid, often have to muddle through this data using nothing more than search routines they develop themselves to parse databases and other storage areas. The sheer volume of data makes it almost impossible for them to do that quickly.

Indeed, the only way to search for data at these volumes is to use automated systems that scan an organization's data stores and build an index of where the data is located, how it is related, what it depends on, and how it is recorded in the databases – the metadata. That index can then be queried for the information organizations need to make solid business decisions.
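
As a rough illustration of what such an index might look like, the sketch below scans a SQLite database with Python's standard `sqlite3` module and records where each column name appears. It is a minimal example under stated assumptions, not a description of any particular product: the function name `build_metadata_index` and the demo schema are hypothetical, and it captures only locations, not relationships or dependencies.

```python
import sqlite3
from collections import defaultdict

def build_metadata_index(conn):
    """Map each column name to the (table, declared type) pairs where it appears."""
    index = defaultdict(list)
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )]
    for table in tables:
        # pragma_table_info is SQLite's table-valued form of PRAGMA table_info.
        for column, col_type in conn.execute(
            "SELECT name, type FROM pragma_table_info(?)", (table,)
        ):
            index[column.lower()].append((table, col_type))
    return dict(index)

# Hypothetical demo schema, held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, email TEXT);
""")

index = build_metadata_index(conn)
print(index["email"])  # every table that stores an "email" column
```

Once built, the index answers questions such as "where is e-mail data stored?" with a dictionary lookup rather than a manual search.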

When it comes to data, the magic word today is “control”; data is the “new gold,” and getting control of it is essential for the welfare of any organization. The more you know, the better the decisions you can make about sales, marketing, hiring, investments, and a thousand other issues. Failure to get accurate information for these decisions can lead to major problems for organizations.

But getting control over such large amounts of data is a challenge, and exacerbating the problem are issues with the integrity of the data itself, including the metadata – the classifications used in databases and other storage areas. This is a chronic – and central – problem for many organizations, and one that by itself could seriously hamper their ability to even find data, much less control it.

How? Often, the same information is classified or labeled differently from one database to the next. An organization might record information about a customer's location under a label called “location,” “address,” “city and state,” etc. Whatever search system is implemented needs to take these differences into account. If an organization can’t get the names it uses for the same data straight, how can it hope to control it?
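
One simple way to handle inconsistent labels is a synonym table that maps each variant to a single canonical name. The sketch below is only an illustration: the canonical name `customer_location`, the table names, and the helper functions are all hypothetical, and a real system would need a far larger mapping.

```python
# Hypothetical synonym table: variant label -> canonical field name.
CANONICAL_LABELS = {
    "location": "customer_location",
    "address": "customer_location",
    "city and state": "customer_location",
}

def normalize_label(label):
    """Return the canonical name for a label, or the cleaned label if unknown."""
    cleaned = " ".join(label.replace("_", " ").lower().split())
    return CANONICAL_LABELS.get(cleaned, cleaned)

def group_by_canonical(columns):
    """Group (table, column) pairs by the canonical name of the column."""
    groups = {}
    for table, column in columns:
        groups.setdefault(normalize_label(column), []).append((table, column))
    return groups

# Hypothetical columns found in three different systems.
columns = [
    ("crm_customers", "Location"),
    ("billing_accounts", "address"),
    ("shipping", "City_and_State"),
    ("orders", "total"),
]
groups = group_by_canonical(columns)
print(groups["customer_location"])
```

Here the three differently labeled columns end up in one group, so a search for customer location data finds all of them.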

Getting control of data isn't just about making better business decisions. GDPR rules – which apply to any company that even peripherally works with European residents or entities, meaning just about everyone – require organizations to delete personal information on Europeans, such as e-mail addresses or purchase histories, on demand. In order to do that, organizations need to be able to track down the information in the many places it is stored – databases, backups, social media posts, etc. Failure to do so could cost the company hefty fines.
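
To make the erasure problem concrete, the sketch below deletes one person's rows from every table in a single SQLite database that has an e-mail column. It is a minimal illustration, not a compliance procedure: the function `erase_by_email`, the list of column names it looks for, and the demo tables are assumptions, and backups or data held outside the database are not covered.

```python
import sqlite3

def erase_by_email(conn, email, email_columns=("email", "e_mail", "email_address")):
    """Delete rows matching `email` from every table with an e-mail column.

    Returns {table: rows_deleted}. The column-name list is an assumption.
    """
    deleted = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )]
    for table in tables:
        columns = [row[0] for row in conn.execute(
            "SELECT name FROM pragma_table_info(?)", (table,)
        )]
        for column in columns:
            if column.lower() in email_columns:
                # Identifiers come from the schema itself; the value is bound.
                cursor = conn.execute(
                    f'DELETE FROM "{table}" WHERE "{column}" = ?', (email,)
                )
                deleted[table] = deleted.get(table, 0) + cursor.rowcount
    conn.commit()
    return deleted

# Hypothetical demo data, held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE newsletter (email_address TEXT, topic TEXT);
    CREATE TABLE products (sku TEXT, name TEXT);
    INSERT INTO customers (email) VALUES ('ana@example.com'), ('bo@example.com');
    INSERT INTO newsletter VALUES ('ana@example.com', 'deals'), ('ana@example.com', 'news');
    INSERT INTO products VALUES ('A1', 'Widget');
""")

result = erase_by_email(conn, "ana@example.com")
print(result)
```

The sketch only works because it can discover which tables hold e-mail data, which is exactly the metadata question the article raises.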

Control of data, then, is crucial not only to an organization's growth, but also to preventing losses. Organizations need to ask themselves what the most effective way to locate and control their data is. The problem is widespread: according to a study by NewVantage Partners, 85% of companies are trying to be data-driven, but only 37% of that number say they’ve been successful.

Control of data starts with ensuring that the associated metadata is managed properly – that an organization is able to find all the data it needs, regardless of how it’s labeled in the database and storage systems.

Ensuring metadata consistency is among the most important aspects of data control – but it’s also among the most difficult to institute. The manual methods used by BI teams to track down metadata are no match for the colossal amount of information that needs to be searched through.

Automated systems that discover and analyze metadata enable organizations to find out what they really have, and can be the best way to eliminate the wasted time and frustration of manual and semi-manual search methods.
