The conflict between data science and cybersecurity

Big data analytics, machine learning and predictive analytics are supposed to be the panacea that will allow businesses and organizations to solve all their problems. The promise is that, given access to large swaths of data, individuals in an organization can find new and interesting ways to solve complex problems.

When it comes to big data and data mining, the more data you have, the more accurate your analytics.

The problem with this is that it assumes everyone can (and more importantly should) have access to all the information in an organization. In fact, security and privacy concerns mean that the exact opposite is true.

Cybersecurity is in direct conflict with the basic tenets of data analysis, especially big data analysis, predictive analytics and data mining. Data analysis, especially big data, is about opening access to a lot of the data in an organization to find new and interesting ways to solve business problems. For example, you might give a marketing team information that they typically don’t use so they can find new and interesting ways of positioning your products or do some predictive analytics for sales forecasting.

Data governance and security are about locking down access to data so that only the individuals who should have access get it. It’s about granting least-privilege access and pruning open access to data. This is becoming all too important these days as we see massive breaches where customer or employee data is lost and companies face, at minimum, reputational losses. At worst, they see massive fines or loss of profit.
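To make the idea of least-privilege access concrete, here is a minimal Python sketch. It is only an illustration: the role names, the column names and the `filter_record` helper are hypothetical and do not come from any particular product. The point is that access is denied by default and each role sees only the fields it has been explicitly granted.

```python
# Hypothetical mapping of roles to the fields each one may read.
# Anything not listed here is denied by default.
ROLE_COLUMNS = {
    "marketing": {"region", "product", "units_sold"},
    "support": {"customer_id", "email", "product"},
}


def filter_record(role, record):
    """Return only the fields the given role is explicitly allowed to read."""
    allowed = ROLE_COLUMNS.get(role, set())  # unknown roles get nothing
    return {key: value for key, value in record.items() if key in allowed}


# Illustrative record; the values are made up for this example.
record = {
    "customer_id": 17,
    "email": "a@example.com",
    "region": "FL",
    "product": "hammer",
    "units_sold": 3,
}

marketing_view = filter_record("marketing", record)  # no customer_id or email
unknown_view = filter_record("intern", record)       # empty: role not granted anything
```

With this shape, the marketing team can still run its sales analysis on region and product data, but it never sees the personally identifiable fields.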

It’s not just about data breaches. Regulatory requirements also play a big role in who should have access to what data. Across the EU we are seeing regulations that allow customers to be forgotten. This means you must have a complete understanding of where your sensitive data lives and have the ability to track down and delete all instances of a single customer’s information.
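As a rough illustration of what tracking down and deleting every instance of a customer involves, the Python sketch below models each system as an in-memory list of records. The store names, the `customer_id` field and both helper functions are assumptions made for this example. A real erasure process would also have to cover places such as backups, logs and third-party systems.

```python
def find_customer(stores, customer_id):
    """Map each store name to the row indexes that hold this customer's data."""
    hits = {}
    for name, rows in stores.items():
        indexes = [i for i, row in enumerate(rows)
                   if row.get("customer_id") == customer_id]
        if indexes:
            hits[name] = indexes
    return hits


def erase_customer(stores, customer_id):
    """Delete the customer's rows in place; return how many were removed per store."""
    removed = {}
    for name, rows in stores.items():
        kept = [row for row in rows if row.get("customer_id") != customer_id]
        if len(kept) != len(rows):
            removed[name] = len(rows) - len(kept)
            rows[:] = kept  # mutate the existing list so every holder sees the change
    return removed


# Hypothetical data stores; names and records are made up for this example.
stores = {
    "crm": [
        {"customer_id": 1, "email": "a@example.com"},
        {"customer_id": 2, "email": "b@example.com"},
    ],
    "orders": [
        {"customer_id": 1, "item": "hammer"},
        {"customer_id": 1, "item": "nails"},
    ],
    "newsletter": [{"customer_id": 2, "email": "b@example.com"}],
}

found = find_customer(stores, 1)     # where does customer 1 appear?
removed = erase_customer(stores, 1)  # delete those rows everywhere
```

The inventory step (`find_customer`) is the hard part in practice: you cannot delete what you do not know you have.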

These examples highlight the need to protect data despite the desire to allow more people to have access for analysis. It is becoming more and more important to find the right balance between opening access to data for analytics and ensuring you are protected against data loss and security breaches.

It would be easy to say just lock everything down, but that’s no longer realistic. The organization that can better use its existing data gains a huge competitive advantage. Data analysis is critical across all industries and departments within organizations of all sizes.

If you think about it, data analysis is at the core of most jobs these days. It’s rare to find a job that doesn’t involve some form of data analysis. This is not limited to white-collar jobs. Organizations are seeing the value of making informed decisions using data that in the past collected dust in file cabinets.

Now this electronic data is the raw material for deep data analysis and predictive analytics. It’s not just about correlating which ad led to more sales. It’s about taking other metrics like the weather, the time of year, which political party is in office and sentiment on Twitter to predict the sale of hammers in Florida.

It’s going to be interesting to see how these two conflicting ideas will play out in the next 5 to 10 years. Understanding what data it has, who has access to that data and where the sensitive and personally identifiable information sits in the organization is and will continue to be one of the most important things a company can do to protect itself.

While organizations try to protect their customers and employees, they also need to open access to data so they can do the kinds of analytics that will give them a competitive advantage. It’s imperative that they strike the right balance between security and analytics. The organizations that find this balance will undoubtedly be very successful.
