© 2019 SourceMedia. All rights reserved.

Increased data volumes, variety and sources feed rise of data lakes

More organizations are deploying data lakes as part of their data management strategies—and the cloud is playing an increasingly important role in these endeavors.

The global data lakes market is expected to see a compound annual growth rate (CAGR) of 28 percent in the coming years, reaching $12.01 billion by 2024, according to a recent report by Advance Market Analytics.

Among the key market drivers are the rising volume and variety of data from a growing number of sources, and increased implementation of cloud-based software and services, according to the report.

Many enterprises have matured in their use of data lakes and are now contemplating how to modernize their data lake infrastructure. Companies typically consider four directions as they look to modernize data lake platforms, according to Philip Russom, director of research for data management at TDWI, an organization that provides technical education and research in a variety of data management areas.

One is to keep using the whole Hadoop stack but migrate the data lake's data from on-premises systems to the cloud. Another is to replace the Hadoop Distributed File System (HDFS) with cloud-based storage. A third approach is to start over in the cloud with data platforms designed specifically for the cloud. And a fourth is to go hybrid, virtual or both by distributing the data lake across multiple platforms.
