
40 Vendors We're Watching 2012: Big Data

To learn more about how the companies were selected and to view the list of all 40 vendors, please click here.

Cloudant

What: Managed cloud database
Why: NoSQL that’s distributed, replicated and “managed by experts,” with a REST API, full-text search and analytics on globally scalable Apache CouchDB. Besides migrating your marketing analytics database to these MIT guys, they sound like the type you’d trust to feed your dog while you’re away.
Where: Boston, MA
Of Note: Answers the question, “so, where do you go from here?” for three MIT physicists who did a stint managing data at the Large Hadron Collider. Scale is what they live for. Serves as a NoSQL data layer for Windows Azure and has a gig hosting online game player data.
www.cloudant.com

Cloudera

What: Data storage and processing services on Apache Hadoop
Why: For big data users that are ready for enterprise-level security, integration and infrastructure and a variety of subscription-based services. Backers have made it rain in this cloud for the last three years.
Where: Palo Alto, CA
Of Note: Founder and CEO Mike Olson is among the most visible and outspoken visionaries and advocates of Hadoop. Cloudera’s platform was part of the big data appliance Oracle launched this year, and has been seen hanging out with Pentaho, HP and IBM. Has its own certification “university” for Hadoop training. Customers include eBay, Groupon, Morgan Stanley, Nokia and Qualcomm.
www.cloudera.com

DataStax

What: NoSQL, Cassandra, Hadoop
Why: Okay, we’re sensing a theme. DataStax brings an array of open source products to Cassandra, a scalable NoSQL database for real-time big data workloads across multiple nodes. With a workhorse of a download platform (you pay for support and consulting), our source tells us DataStax will become a household name in BI. (Is there such a thing as a household name in BI?)
Where: San Mateo, CA
Of Note: Customers include lots of service providers, including eBay, GoDaddy and LivePerson; Netflix uses Apache Cassandra to minimize downtime and outages.
www.datastax.com

GridGain

What: In-memory Java middleware for big data
Why: In-memory = real time. This high-performance Java middleware can start with fewer than 10 lines of code (they print the world’s shortest MapReduce app on the back of their business cards) to build enterprise e-commerce platforms, hyperlocal advertising, global gaming platforms and more.
Where: Foster City, CA
Of Note: GridGain carries clout, counting some of the largest companies in the world as customers, such as Apple, Canon and Sony.
www.gridgain.com

Hadapt

What: Big data analytics
Why: Combining relational database technology with Hadoop into a single system, Hadapt produces cloud-based big data analytics. Data stored in Hadapt can be accessed using existing SQL-based tools, and SQL queries can be performed significantly faster than with Hadoop+Hive.
Where: Cambridge, MA
Of Note: Hadapt made Gartner’s 2012 Cool Vendors in Information Infrastructure and Big Data. MassTLC named Hadapt an Innovative Technology of the Year for Big Data.
www.hadapt.com

Hortonworks

What: Enterprise big data platform on open source Apache Hadoop
Why: With engineers and financing from Yahoo!, Hortonworks was big in the big data space right off the bat. You could say these folks wrote the book on enterprise use of Hadoop because they did, a lot of it anyway. With a year under its belt, Hortonworks gets high marks from analysts on cluster monitoring and metadata sharing across systems.
Where: Sunnyvale, CA
Of Note: In a tight big data market that often confuses the C-suite set, Hortonworks turned some heads at Hadoop World and Strata conferences in the past year; high-profile rollouts and Yahoo! connections were no doubt aided by big data relationships with Teradata, Microsoft and others. Like other providers in the competitive democracy that is Hadoop, Hortonworks has its own certification courses and a legion of developers in its virtual sandbox.
hortonworks.com

MapR

What: Enterprise-scale Apache Hadoop distribution
Why: Claims “no single point of failure” or downtime and full data protection. No shortage of community coding language contributions from MapR, and it’s winning commercial converts of late on those SLAs. It’s early to name any knock-out winners in enterprise Hadoop, but clearly stated use cases across multiple industries and a mantra of reliability can’t hurt their case.
Where: San Jose, CA
Of Note: Offers two editions and turnkey solutions for private and multitenant environments, and can run on AWS and Google Compute Engine; 451 Group called MapR the “clear choice” for Hadoop.
www.mapr.com

Skytree

What: Machine learning for big data analytics
Why: Machine learning is one clear path to big data processing. Use cases for clustering/segmentation, outliers, predictive analytics, similarity search. Automation and low entry point make it a low-risk bet.
Where: San Jose, CA
Of Note: More than four decades of scalable machine learning experience in environments including the Large Hadron Collider, NASA and the Sloan Digital Sky Survey. Advisory Board includes Pat Hanrahan (Stanford and co-founder of Pixar & Tableau) and Michael Jordan (UC Berkeley’s top machine learning expert).
Note: This entry was corrected 10/9 to eliminate references to natural language processing, which Skytree does not presently offer.
www.skytreecorp.com

Splunk

What: IT and machine analytics
Why: Because machine data is churning and we’re still figuring out what we need and what to do with it. Splunk’s ROI comes from reducing IT downtime, cutting legacy cost, supporting revenue-generating IT, reducing fraud and enforcing SLAs with business and compliance risk insights.
Where: San Francisco, CA
Of Note: Management team built from potential machine life forms recruited at Disney, Apple, Oracle, Microsoft, Autodesk, Infoseek, Informix and SAP. Seriously though, a shelf full of awards and a Best Place to Work in the Bay Area award from the San Francisco Business Times and San Jose/Silicon Valley Business Journal.
www.splunk.com

For more:

  • View all the Analytics/Visualization vendors in our list here.
  • View all the BI vendors in our list here.
  • View all the Database vendors in our list here.
  • View all the Integration/Governance vendors in our list here.
And click here to read the full 40 Vendors We're Watching: 2012.

Information Management’s “40 Vendors We’re Watching” 2012 is a list of up-and-coming vendors on our radar that are doing their part to shape the groundswell in information management technology in the 21st century.

As our editors and advisers reviewed the strengths of each of these companies, we determined that five main themes were apparent: analytics/visualization, big data, business intelligence, database and integration/governance. Our big data category features startup vendors developing open source frameworks for working with large volumes of data for processing and analysis. [Note: Some vendors cross multiple categories.]

 
