5 top trends that will impact big data and analytics
By now we were all supposed to be more connected, but instead we’re getting more fragmented and siloed.
“Likes” in social media polarize us, as algorithms favor inflammatory content that evokes stronger reactions and keeps us hooked longer. We've also seen fragmentation in local laws, regulations and privacy rules.
In the private sector, business schools, strategy heads and activist investors preach divesting anything that's not a core competency. But in a fragmented world, with digital giants lurking around the corner, do we need to think differently?
For regulations, business models and data – which are increasingly the same thing – we can turn a fragmenting landscape into an opportunity.
But analysis isn’t enough. We need synthesis and analysis to connect distributed data to the analytic supply chain, with catalogues as the connective tissue. The tech is there today, but it also needs to be backed by the right processes and people. Synthesis and analysis are critical to making use of pervasive data and to facilitating the evolution towards what we call “laying the data mosaic.”
Below is a curated sampling of the five trends I see as most important in the coming year.
- Big Data is Just Data. Next up – “Wide Data”
Big Data is a relative term, and a moving target. One way to define big data is as data that is beyond what you can handle with your current technology. If you need to replace your infrastructure, or invest significantly in extra infrastructure, to handle your data volumes, then you have a big data challenge.
With virtually unlimited cloud storage, that shortcoming is gone. It’s easier now than ever to do in-database indexing and analytics, and we have tools to make sure data can be moved to the right place. The mysticism of data is gone – the consolidation and rapid demise of Hadoop distributors in 2019 are a signal of this shift.
The next focus area will be very distributed data, or “wide data.” Data formats are becoming more varied and fragmented, and as a result the number of database types suited to different flavors of data has more than doubled – from 162 in 2013 to 342 in 2019. Combinations of wide data “eat big data for breakfast,” and those companies that can achieve synthesis of these fragmented and varied data sources will gain an advantage.
- DataOps + Analytic Self-Service Brings Data Agility Throughout the Organization
Self-service analytics has been on the agenda for a long time, and has brought answers closer to the business users, enabled by “modern BI” technology. That same agility hasn’t happened on the data management side – until now.
“DataOps” has come onto the scene as an automated, process-oriented methodology aimed at improving the quality and reducing the cycle time of data management for analytics. It focuses on continuous delivery, and achieves this by leveraging on-demand IT resources and automating the testing and deployment of data. Technologies like real-time data integration, change data capture (CDC) and streaming data pipelines are the enablers.
Through DataOps, 80% of core data can be delivered in a systematic way to business users, with self-service data preparation as a standalone area needed in fewer situations. With DataOps on the operational side, and analytic self-service on the business user side, fluidity across the whole information value chain is achieved, connecting synthesis with analysis.
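To make the change data capture idea from this section concrete, here is a minimal sketch in Python. It assumes a simplified event format of my own (`ChangeEvent` with `op`, `key` and `row` fields) and an in-memory dictionary standing in for the analytics-side copy of a table. Real CDC tools read a database's transaction log and use their own event schemas, so treat the names here as hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record: one row-level change from a source table."""
    op: str                               # "insert", "update" or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row values (None for deletes)


def apply_changes(target: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Replay change events, in order, onto a replica keyed by primary key."""
    for event in events:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key} has no row")
            target[event.key] = dict(event.row)  # copy so the replica owns its data
        elif event.op == "delete":
            target.pop(event.key, None)          # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return target


# Only the changes travel downstream, not a full reload of the table.
events = [
    ChangeEvent("insert", 1, {"name": "Ada", "plan": "free"}),
    ChangeEvent("insert", 2, {"name": "Lin", "plan": "pro"}),
    ChangeEvent("update", 1, {"name": "Ada", "plan": "pro"}),
    ChangeEvent("delete", 2),
]
replica = apply_changes({}, events)
print(replica)  # {1: {'name': 'Ada', 'plan': 'pro'}}
```

The point of the sketch is the shape of the flow: the downstream copy stays current by replaying small, ordered changes, which is what makes continuous delivery of data practical.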
- Active Metadata Catalogues - the Connective Tissue for Data and Analytics
Demand for data catalogues is soaring as organizations continue to struggle with finding, inventorying and synthesizing vastly distributed and diverse data assets. In 2020, we’ll see more AI-infused metadata catalogues that will help shift this gargantuan task from manual and passive to active, adaptive and changing. This will be the connective tissue and governance for the agility that DataOps and self-service analytics provide.
Active metadata catalogues also include information personalization, which is an essential component for generating relevant insights and tailoring content. But for this to happen, a catalogue needs to work not just “inside” one analytical tool, but across the fragmented estate of tools that most organizations have.
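As a rough illustration of a catalogue that spans several tools and reacts to usage, consider the Python sketch below. Everything in it is an assumption made for the example: the `CatalogueEntry` fields, the tool names and the choice to rank search results by access count are a stand-in for "active" behaviour, not a description of any particular product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CatalogueEntry:
    """Hypothetical metadata record for one data asset."""
    name: str
    source_tool: str              # which tool or platform holds the asset
    owner: str
    tags: Set[str] = field(default_factory=set)
    access_count: int = 0         # usage signal collected over time


class Catalogue:
    """Toy catalogue that indexes assets from any tool and ranks by usage."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        self._entries[entry.name] = entry

    def record_access(self, name: str) -> None:
        self._entries[name].access_count += 1

    def search(self, tag: str) -> List[CatalogueEntry]:
        """Return assets carrying the tag, most-used first."""
        matches = [e for e in self._entries.values() if tag in e.tags]
        return sorted(matches, key=lambda e: e.access_count, reverse=True)


catalogue = Catalogue()
catalogue.register(CatalogueEntry("sales_orders", "warehouse", "finance", {"sales"}))
catalogue.register(CatalogueEntry("churn_dashboard", "bi_tool", "marketing", {"sales", "churn"}))
catalogue.record_access("churn_dashboard")
catalogue.record_access("churn_dashboard")

results = catalogue.search("sales")
print([(e.name, e.source_tool) for e in results])
```

Because entries carry a `source_tool` field instead of living inside one product, the same search covers the warehouse and the BI tool alike, and the ranking shifts as people actually use the assets.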
- Data Literacy as a Service
Connecting synthesis and analysis to form an inclusive system will help drive data usage, but no data and analytic technology or process in the world can function if people aren’t on board. And dropping tools on users and hoping for the best is no longer enough.
A critical component for overcoming the industry-standard 35% analytics adoption rate is helping people become confident in reading, working with, analyzing and communicating with data. In 2020, companies expect data literacy to scale, and want to partner with vendors on this journey. This is achieved through a combined software, education and support partnership – as a service – with outcomes in mind.
The goal could be to drive adoption to 100%, helping combine DataOps with self-service analytics, or to make data part of every decision. For this to be effective, one needs to self-diagnose where the organization is and where it wants to get to, and then symbiotically work out how those outcomes can be achieved.
- “Shazaming” Data, and Computer/Human Interactions
The effects of data analysis on vast amounts of data have now reached a tipping point, bringing us landmark achievements. We all know Shazam, the famous music service where you can record sound and get info about the identified song. More recently, this has been expanded to more use cases, such as shopping for clothes simply by analyzing a photo, or identifying plants and animals.
In 2020, we’ll see more use cases for “shazaming” data in the enterprise, e.g. pointing to a data source and getting telemetry such as where it comes from, who is using it, what the data quality is, and how much of the data has changed today. Algorithms will help analytic systems fingerprint the data, find anomalies and insights, and suggest new data that should be analyzed with it. This will make data and analytics leaner and enable us to consume the right data at the right time.
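A small Python sketch can show what fingerprinting a data source and reporting "how much has changed" might look like. This is only one possible approach under simple assumptions: rows are dictionaries with an `id` key, a fingerprint is a SHA-256 hash of the row's JSON form, and data quality is reduced to a null rate. The function names are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def row_fingerprint(row: Dict[str, Any]) -> str:
    """Hash a row's content; sorted keys make the hash independent of key order."""
    payload = json.dumps(row, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def profile(rows: List[Dict[str, Any]], key: str) -> Dict[str, Any]:
    """Collect simple telemetry: row count, share of null cells, per-row fingerprints."""
    total_cells = sum(len(r) for r in rows)
    null_cells = sum(1 for r in rows for v in r.values() if v is None)
    return {
        "row_count": len(rows),
        "null_rate": null_cells / total_cells if total_cells else 0.0,
        "fingerprints": {r[key]: row_fingerprint(r) for r in rows},
    }


def changed_fraction(old: Dict[Any, str], new: Dict[Any, str]) -> float:
    """Share of keys whose row was added, removed or modified between snapshots."""
    keys = set(old) | set(new)
    if not keys:
        return 0.0
    changed = sum(1 for k in keys if old.get(k) != new.get(k))
    return changed / len(keys)


yesterday = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": None}]
today = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lund"}, {"id": 3, "city": "Bergen"}]

before = profile(yesterday, key="id")
after = profile(today, key="id")
change = changed_fraction(before["fingerprints"], after["fingerprints"])

print(before["null_rate"], after["null_rate"])  # 0.25 0.0
print(round(change, 2))                         # 0.67
```

In this toy example, one row was modified and one was added out of three keys, so two thirds of the data changed. Comparing hashes instead of full rows keeps the check cheap, which matters when the same question is asked of many sources every day.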
We will see this combined with breakthroughs in interacting with data – going beyond search, dashboards and visualization. Increasingly we’ll be able to interact through the senses, with movements and expressions, and even with the mind. Facebook’s recent acquisition of CTRL Labs, the maker of a mind-reading wristband, and Elon Musk’s Neuralink project are early signals of what’s to come.
In 2020, some of these breakthrough innovations will begin to change the experience of how we interact with data. This holds great human benefits for all, but can also be used for ill, and must be used responsibly.
Turn fragmentation to your advantage by connecting synthesis and analysis to form a dynamic system. DataOps and Self Service will be the process and method. Data Literacy and Ethics will guide people to do the right thing. Innovative technologies powered by AI will help throughout the entire chain to enhance and accelerate data use.
These trends form tiles for laying a data mosaic in a complex fragmented world, making the use of data pervasive across the enterprise and ushering us into the next phase of success in the digital age.