Is Data Analytics Falling Victim to Its Own Complexity?
Data professionals and IT leaders are desperate to do more with data this year, but for many organizations, that is a tall order. The problems often lie with the growing complexity of data and of data analytics tools.
According to Jon Bock, vice president at Snowflake Computing, these challenges are forcing many organizations to reassess their data analytics strategies, available skill sets, and project goals. Information Management caught up with Bock at the Strata & Hadoop World conference in San Jose, CA, recently and asked him what these trends all mean.
Information Management: What are the most common themes that you heard among conference attendees and how do those themes align with what you expected?
Jon Bock: We heard a big shift from a focus just on the technology to a focus on getting projects delivered. In past years we heard people talk a lot about some new technology, feature or component—what was the new announcement, new Apache project, new framework, new management tool, etc.
Those conversations are still happening. But as people take technology that they’d been experimenting with, or had deployed for a small number of experts, and apply it to solutions that a much larger group relies on, their focus has shifted to more pragmatic concerns about how to complete projects.
This shift in focus, from experimentation to broader deployment and integration, leads people to ask the tough questions that they need to ask: how much effort it’s going to take to tune and manage their solution, how they’re going to scale it, how well the technology and its features suit their data questions, and more.
IM: What are the most common data challenges that attendees are facing?
JB: They’re finding themselves forced to do a lot of custom, manual work to get everything working together. That’s making it extremely difficult for them to figure out how to integrate and scale data analytics to support more of the company.
The other thing we saw was people pushing the boundaries of the technologies they’re using for storing and analyzing data, and hitting challenges that are forcing them to step back and assess what each technology is and is not good for. We first saw that happen with Hadoop, and we’re seeing the same with Spark: people started trying to use them in places beyond what they were designed for, where they may not be the best fit.
IM: What are the most surprising things that you heard from attendees regarding their data management initiatives?
JB: That they’re willing to make significant changes to the technologies they’re using. The conventional wisdom is that once a platform for data processing is deployed, it takes years, maybe even decades, to change it. Instead, we heard a lot of people talk about looking at new options and being willing to make a change if it would help them meet their goals.
It’s clear that the amount of innovation in data analytics that we’ve seen over the last decade has encouraged people to think more aggressively about trying new technologies, while cloud and SaaS have made it easier to do that.
IM: How do these themes and challenges relate to your company’s market strategy this year?
JB: Snowflake is focusing on giving people a solution that addresses the limitations people are seeing in both traditional and new database and data platform offerings while eliminating the complexity that’s getting in the way of expanding data analytics initiatives.
One part of that means helping people realize that they don’t have to use platforms like Hadoop and Spark for things they weren’t designed to handle. Another part means replacing manual complexity with intelligent, self-managing software. And finally, it means making it easy for people to leverage their existing skills, tools, and processes with our disruptively innovative data platform.
IM: What does your company view as the top issues or challenges with regard to data?
JB: We’re at a tipping point where all of the different technologies that have emerged over the last decade have added so much sprawling complexity that now as people look to expand their data analytics projects, they’re finding that it’s difficult or impossible to do that because of the complexity.
You can see the impact of that in multiple ways: in the heavy demand for specialized data science programming and operations skills; in the number of new management tools that are popping up; and more. The reality of this complexity is forcing a reckoning for customers: which technologies are worth the complexity, and what can customers do to minimize it?