Increased data volumes, variety and sources feed rise of data lakes
More organizations are deploying data lakes as part of their data management strategies—and the cloud is playing an increasingly important role in these endeavors.
The global data lakes market is expected to see a compound annual growth rate (CAGR) of 28 percent in the coming years, reaching $12.01 billion by 2024, according to a recent report by Advance Market Analytics.
Among the key market drivers are the rising volume and variety of data from a growing number of sources, and increased implementation of cloud-based software and services, according to the report.
Many enterprises have matured in their use of data lakes and are now considering how to modernize their data lake infrastructure. Companies typically weigh four directions as they look to modernize data lake platforms, according to Philip Russom, director of research for data management at TDWI, an organization that provides technical education and research in a variety of data management areas.
One is to keep using the whole Hadoop stack but migrate the data lake's data from on-premises systems to the cloud. Another is to replace the Hadoop Distributed File System (HDFS) with cloud-based storage. A third approach is to start over in the cloud with data platforms designed specifically for the cloud. And a fourth is to go hybrid and/or virtual by distributing the data lake across multiple platforms.