More than one recent big data study has labeled 2016 as the year of action – a time to actually act on the insights that data analytics can provide the organization. That theme was echoed at the recent Strata & Hadoop World conference in San Jose, CA.
Information Management spoke with Ash Parikh, vice president of data integration, data security and big data at Informatica, for his take on what this new focus means.
Information Management: What are the most common themes that you are hearing and how do those themes align with what you expected?
Ash Parikh: The Big Data space has been evolving gradually over the last few years reflecting the level of maturity in real customer projects. A few years ago, if you were to attend any of the major Big Data events, road shows or conferences, you would walk away extremely excited about the buzz, the giveaways, the myriad of technologies mushrooming by the minute, and the newness of the space in general.
Today, I can say that we are starting to see a shift – there are fewer giveaways at vendor booths, for starters, and more awareness in general that it is a nightmare to keep up with all the new technologies being introduced and the ones that are becoming outdated so quickly.
Additionally, there is more discussion about how to deliver value from all the investments around Big Data – how can I increase campaign effectiveness, how can I ensure improved healthcare outcomes or how can I reduce the risk of fraud? The fact that there are now more sessions and articles and blogs and Tweets about how not to turn a data lake into a data swamp is evidence in itself that companies are starting to ask some hard questions.
IM: What are the most common challenges that organizations are facing with regard to data management and data analytics?
AP: Firstly, the audience is not even fully aware that there is a problem. According to Gartner and other leading industry analyst firms, over 70% of Big Data projects either fail entirely or struggle to go beyond experimentation because of a lack of up-front due diligence on data management for Big Data.
It is generally felt that it is enough to simply spin up a Hadoop cluster, dump all types of data into it at scale, create a sandbox, and experiment, and then, lo and behold, almost magically, those golden needles in the haystack (read that as new and unique insights) will reveal themselves. Typically, all this is done by bringing together a host of open source technologies and throwing hand coding at the problem.
If this effort needs to scale, and more importantly, deliver trusted and timely insights, customers typically translate this to more hand coding. Simply throwing bodies at the problem is not scalable and won't solve the complex data management issues with Big Data. There are serious data integration, data governance and data security issues that need to be handled at scale. Manual goes only so far.
IM: What are the most surprising things that you are hearing?
AP: One of the most surprising things we have observed from our interactions with customers is that they don’t even know that they have a problem until it is too late. And when they do, they go into reactive mode and try to address complex data management issues with Big Data with hand coding.
The other issue that has sprung up over the last year or so is that there is a growing belief that a stand-alone data preparation tool is enough to handle these issues. Let’s put this down to hype as well. A few years ago we saw something similar happen in the business intelligence space – where the new age self-service business intelligence tools promised to deliver their users from all kinds of data management challenges. But very quickly people realized that these tools needed a solid data management platform underneath to truly deliver those insights to those beautiful dashboards.
It’s the same case with stand-alone data preparation tools, which can go only so far in addressing the serious data integration, data governance and data security issues that need to be handled at scale in big data projects.
IM: How do these themes and challenges relate to your company’s market strategy this year?
AP: Informatica has over 20 years of experience, singularly focused on handling complex data management issues for its customers. Nobody knows data management like Informatica – whether the data is small or Big Data.
We work with over 5000 customers who trust us with their data – something that they believe is the lifeblood of their organizations. We have brought together everything data: acquisition, ingestion, transformation, security, matching and linking, governance and certification, preparation and delivery to those dashboards, all in a purpose-built, metadata-driven and integrated platform. This eliminates the need for hand coding, facilitates automation, encourages reuse, and nullifies the nightmare of dealing with stand-alone solutions that cannot deliver the value a customer is looking for in Big Data projects.
Informatica recently launched the first Big Data Management platform in Fall 2015 to deliver the operational agility needed in these projects, and took this one step further by introducing the Intelligent Data Lake in Spring 2016 for business users, built on a Big Data management foundation for managed business self-service.
IM: What does your company view as the top data issues or challenges in 2016?
AP: The world of data management and analytics is not new – but it is definitely not stagnant. There is constant change with regard to new technologies and new approaches being discovered every day, for delivering more business value and better business insights. This is indeed exciting, and fuels innovation. However, what a customer needs to guard against in such a dynamic environment is hype.
Today, more than ever, there is a need for level-headedness and pragmatic thinking. There is a need to step back, breathe and smile when an article or blog or webinar or Tweet boldly announces that “data warehouses are dead” or “Hadoop is all that you need” or “data preparation will solve all your big data management issues,” as nothing could be further from the truth.
Ask yourselves the following questions:
Do you need to know what you sold the customer in the past?
And do you also need to know what your customer might want to buy in the future?
If the answer to both those questions is a resounding yes, then you will need a data warehouse as well as a data lake to get a holistic answer, and get your hands on that golden needle in the haystack. And if that needle is indeed golden, then it is a safe bet that you will need to be proactive and up-front in doing all the data management due diligence around all your data – big or small.