A growing number of organizations are adopting big data tools such as Apache Spark to help manage their growing data assets. But will Spark adoption catch up with the popularity of Hadoop?
That question seems to have been on the minds of many attendees at the recent Strata & Hadoop World conference in San Jose, CA. Information Management spoke with Eric Tilenius, CEO at BlueTalon, at the event for his take on what attendees were thinking about most.
Information Management: What are the top themes and issues you heard and how do they align with what you expected?
Eric Tilenius: The obvious elephant in the room is Spark and whether it will replace Hadoop. A common conversation topic among Strata attendees was how big data projects have evolved beyond Hadoop to encompass a variety of tools and platforms: Hadoop, Spark, Kafka, data sources hosted in the cloud, and SQL-based and non-SQL-based repositories with multiple access methods.
IM: What are the most common data management and analytics challenges that attendees are facing?
ET: Data governance – from creation, to transformation and consumption by end users. As big data projects mature, organizations collect and move more data to and from their Hadoop environment from a variety of data sources. This presents a significant challenge in terms of who owns the data, the quality of the data and who has the rights to see it once it has been moved.
IM: Were there any surprising things that you heard from attendees?
ET: The majority of big data projects are no longer built with the notion of Hadoop as the main data platform. In the past, Big Data and Hadoop were used interchangeably, but this is no longer the case. The majority of enterprise data initiatives today rely on Hadoop, relational databases, and new forms of data management including NoSQL technologies.
The cloud is also an integral component of data initiatives with many organizations taking advantage of the simplicity, flexibility and acceleration provided by cloud platforms to deploy Hadoop.
IM: How do these themes and challenges relate to your company’s market strategy this year?
ET: BlueTalon is focused on helping companies gain full control and visibility over data access at the data layer, regardless of which data platforms are in use.
We see a lot of companies that started with a few Hadoop clusters and are now connecting their data initiatives to multiple other data sources. As they expand their footprint, they are now turning to BlueTalon to help them with enforcing consistent data security and access control at a granular level across multiple data platforms. Being able to monitor who has access to what data to enforce security controls and ensure compliance is becoming paramount.
IM: What are the top data management and data analytics challenges in 2016?
ET: The data platform landscape is becoming more complex. A couple of years ago, data analytics projects were vertically integrated: analytics tools sitting on top of a major data warehouse supported by one database. Today, as organizations need to drive more value out of the data they collect, they’ve diversified their data environment.
This is a major challenge for security staff working on data projects as security controls and data protection get fragmented at different layers, into multiple applications and systems. Most CISOs have lost control and visibility of who has accessed which data. This exposes companies to a higher risk of data breaches and a lack of compliance.