A growing number of companies are exploring data lakes, but many struggle to turn raw material into actionable insights, according to Rich Dill, an enterprise solution architect at SnapLogic. Dill spoke with Information Management at the recent Strata & Hadoop World conference in New York about the implications of that struggle, as well as what attendees were seeking at the show.
Information Management: What are the most common themes that you are hearing among conference participants?
Rich Dill: “There seemed to be two different groups at the show: the companies that are using Hadoop today and are looking for ways to get more out of the stack, and the late adopters who are still trying to figure out what Hadoop can do for their companies.
“The ingestion of data into Hadoop has largely been solved, but a big challenge that I continue to hear from enterprises is how to transform and then get insights OUT of Hadoop. To this end, we’re seeing more interest in cloud-based Hadoop deployments with Spark, Presto, or another compute engine sitting on top of them.”
IM: What does your company view as the top issues or challenges with regard to data in 2016?
Dill: “The technology is still maturing, and processes have not been well defined to deal with the dynamic, unstructured nature of the data lake. This will be a key challenge for enterprises seeking to implement a data lake. The traditional data warehouse was a very structured environment and ideal for businesses that knew the questions to ask the data.
“The data lake is ideal for a new class of analysis, however, that could eventually answer the questions that the business didn’t know to ask in the first place. But as enterprises start down the path to a data lake environment, they lack a way to turn the raw material into something actionable. We’ll see analytics become the killer application for the data lake – visualizations, heat maps and trend lines.”
IM: How do these themes and challenges relate to your company’s market strategy this year?
Dill: “For SnapLogic, our aim is to be an agnostic layer that is the foundation for any big data sources and applications to integrate. Implementing a data lake is just one step in a data management strategy, but the challenge to use the data across applications and tools remains – that’s where SnapLogic comes in.”