4 top trends for big data analytics in 2020
2019 has been, without a doubt, one of the most eventful years for the data industry across the board. We have witnessed groundbreaking cloud innovations and the rise of better ways to collect, access, and analyze big data. All these changes are rapidly increasing the value enterprises get from their data.
In 2020, enterprises will evolve in how they approach their data maturity and strategize their cloud investments. The new year will bring compelling reasons to focus on modern cloud data lakes; more efficient cloud services that markedly reduce cloud computing costs; easier ways to make IoT data a valuable business asset; and open source innovations to accelerate analytics results.
With that in mind, here are four top big data predictions that enterprises should keep an eye on for 2020.
Cloud data warehouses turn out to be a big data detour
Given the tremendous cost and complexity associated with traditional on-premises data warehouses, it wasn’t surprising that a new generation of cloud-native enterprise data warehouses emerged. But savvy enterprises have figured out that cloud data warehouses are just a better implementation of a legacy architecture, and so they’re avoiding the detour and moving directly to a next-generation architecture built around cloud data lakes.
In this new architecture, data doesn’t get moved or copied. There is no data warehouse, and no associated ETL, cubes, or other workarounds. I predict 75% of the Global 2000 will be in production or in pilot with a cloud data lake in 2020, using multiple best-of-breed engines for different use cases across data science, data pipelines, BI, and interactive/ad-hoc analysis.
Enterprises say goodbye to performance benchmarks, hello to efficiency benchmarks
Escalating public cloud costs have forced enterprises to re-prioritize the evaluation criteria for their cloud services, with higher efficiency and lower costs now front and center. The highly elastic nature of the public cloud means that cloud services can (but don’t always) release resources when not in use. And services which deliver the same unit of work with higher performance are in effect more efficient and cost less.
In the on-premises world of over-provisioned assets, such gains are hard to reclaim. But in the public cloud, time really is money. This has created a new battleground where cloud services are competing on service efficiency to achieve the lowest cost per unit of work, and 2020 will see that battle heat up.
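To make the efficiency argument concrete, here is a minimal arithmetic sketch. It assumes a service billed per hour only while it runs and released as soon as the work finishes. The function name, the hourly price, and the runtimes are hypothetical values chosen for illustration, not figures from any vendor.

```python
def cost_per_unit(hourly_price: float, runtime_seconds: float, units_of_work: int) -> float:
    """Cost of one unit of work for a service billed only while it is running."""
    hours = runtime_seconds / 3600
    return hourly_price * hours / units_of_work


# Hypothetical workload: 100 queries on a cluster priced at $8 per hour.
# The slower engine needs 30 minutes, the faster one 15 minutes.
slow = cost_per_unit(hourly_price=8.0, runtime_seconds=1800, units_of_work=100)
fast = cost_per_unit(hourly_price=8.0, runtime_seconds=900, units_of_work=100)

print(f"slow engine: ${slow:.2f} per query")
print(f"fast engine: ${fast:.2f} per query")
```

Under these assumptions, halving the runtime halves the cost per query, but only if the service actually releases its resources when the work is done.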
IoT data finally becomes queryable
The explosion of IoT devices has created a flood of data typically landing in data lake storage such as AWS S3 and Microsoft ADLS as the system of record. But while capturing and storing IoT data is easy, the semi-structured nature of IoT data makes it difficult to process and use: data engineers are forced to build and maintain complex, and often brittle, data pipelines to enrich IoT data, add context to it, and accelerate it.
Software AG has stepped in to tackle this problem head-on with its Cumulocity IoT Data Hub, and I predict that in 2020, IoT data will be directly queryable at high performance via business intelligence, self-service analytics, machine learning, or SQL-based tools.
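As a rough illustration of why semi-structured IoT data needs extra handling before tabular tools can use it, the sketch below flattens a nested JSON message into a single row of dotted column names. The payload shape, field names, and the `flatten` helper are hypothetical and are not taken from Cumulocity or any specific device format.

```python
import json


def flatten(record: dict, prefix: str = "") -> dict:
    """Turn a nested dict into a flat dict keyed by dotted paths."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the column prefix.
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Hypothetical device message as it might land in data lake storage.
raw = (
    '{"device": {"id": "sensor-17", "site": "plant-a"},'
    ' "reading": {"temp_c": 21.5},'
    ' "ts": "2020-01-01T00:00:00Z"}'
)

row = flatten(json.loads(raw))
print(row)
```

Real pipelines also have to cope with arrays, missing fields, and schemas that drift between device firmware versions, which is where much of the brittleness described above comes from.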
The rise of data microservices for bulk analytics
Traditional operational microservices have been designed and optimized for processing small numbers of records, primarily due to bandwidth constraints with existing protocols and transports. I predict a new category of data microservices focused on bulk analytical operations with high volumes of records, and in turn these data microservices will enable loosely coupled analytical architectures which can evolve much faster than traditional monolithic analytical architectures.