A thought occurred to me as I made my way around Enzee Universe 2011, the IBM Netezza user conference, in Boston last week: the participants here are actually the parents of the “kids” I met at the Strata data science meet up in February.
Or at least so it seemed. Much more gray hair and less body art than in Santa Clara. Good for me though. I wasn't nearly as conspicuous.
I was the guest of Revolution Analytics, the commercial R company, which was at Enzee to introduce the integration of its flagship Enterprise R product with the IBM Netezza data warehouse appliance. “Revolution R Enterprise ‘plugs-in’ to IBM Netezza Analytics (that ships with each appliance) to consolidate all analytics activity into a single appliance. With Revolution R Enterprise for IBM Netezza, advanced R computations are available for rapid analysis of hundreds of terabyte-class data volumes – and can deliver 10-100x performance improvements at a fraction of the cost compared to traditional analytics vendors.”
Still, it took me a while to get comfortable with the 1000+ participants. Why? I'm an open source BI guy now, and Netezza, a DW-analytics appliance with an outstanding reputation, is anything but. In today's BI world, though, it's critical for open source to play nice with proprietary. Fortunately, the zealotry seems to be waning as both sides look to turn a profit.
Truth be told, there's a lot of good that can come to commercial open source Revolution Analytics from this integration. First, of course, is the potential order-of-magnitude performance improvement on tasks running RA's Enterprise R, thanks to the in-database, massively parallel processing of the Netezza appliance.
Second, simply being part of the IBM product portfolio is a big plus. IBM, with its Smarter Planet positioning, is the unquestioned leader in driving business decision-making through analytics. And every indication is that big data/analytics will continue to grow at a feverish pace. IBM's in a desirable spot.
Third, Netezza is an acknowledged leader in data warehouse/analytic appliances and has an unparalleled record of performance innovation. Their notion of in-database-analytics is fast becoming a standard in the big quantitative world. RA statistical platform competitor SAS and IBM's own SPSS are already integrated with the Netezza appliance. And visualization heavyweights Spotfire and Tableau are in progress. Pretty soon it might be that all important analytics technologies run in-database with Netezza. So RA should now enjoy an early adopter tailwind from one of the most important technologies in the corporate, big data, BI world today.
Of course, having revered R as a component of its analytics appliance should bestow benefits on IBM-Netezza as well. In several of the technical talks I participated in, the Netezza speakers touted the Revolution R integration highly, acknowledging that R will soon become the lingua franca of the commercial statistical world, just as it currently is in academia. They also stressed the importance of having a commercial vendor stewarding the R platform. One speaker confided that Netezza wants to be the beneficiary when the current cohort of statistics students, who overwhelmingly use R, migrates to the business world. An R aficionado, I must admit I was quite impressed – and gratified – with that thinking.
I saw the power of this IBM-Netezza draft firsthand at Enzee. The Revolution booth was inundated with curious participants during each of the breakfast and midday vendor meeting sessions. Business cards were changing hands and conference cards were being scanned at a furious pace. The Revolution “booth people” had to ration their time with the eager prospects, many of whom seemed quite serious: you'll get back to me next week, right? I watched nearby, mouth agape.
When I wasn't hanging around thinking big thoughts, I managed to take in half a dozen or so presentations, some of which were meaty/informative – others, little more than sales and marketing blather.
IBM Group VP Steve Mills kicked off the conference citing a predictive analytics call to action by business leaders in a 2010 MIT-IBM study demonstrating superior performance by analytics companies. Alliterating smart, speed, simplicity, scalability and standards, Mills made the case for the Netezza appliance as an efficient means for businesses to create those differentiating analytics capabilities. His themes of ROI, agility and time to value would be repeated often at the conference.
Usama Fayyad, Open Insights CEO and Former Chief Data Officer of Yahoo!, is ubiquitous on the analytics conference circuit. I've seen him recently at Predictive Analytics World, Strata, and now Enzee. I love what Fayyad has to say about both the technology and analytics of big data, though he can be hard to follow. His presentations would be better served if he were given more time or were to cut back on the volume of material. At this session, I particularly liked his discussion of a forecasting initiative at Daimler-Chrysler.
Fayyad, RA CEO Norman Nie, and analyst Sean Rogers later led a lively discussion on the merits of big data. Both Fayyad and Nie noted the importance of large data sets when working with rare events and skewed populations. And both seemed to tacitly reject traditional statistics, with its emphasis on significance and on linear models that limit the number of explanatory variables for prediction. Rogers' promotion of iterative exploration was seconded by both Nie and Fayyad, who also emphasized visualization and the artistry of analytic exploration. In the Q&A session that followed, a Forrester analyst noted the ascendance of open source platforms Hadoop and R for handling big data.
Bill Zanine painted a comprehensible picture of Netezza Analytics. With this architecture, the analytics move to the database rather than, as is traditional, the data moving out of the database to the analytics, ideally providing efficiencies and boosting performance. With integration to Java, C++, Python, Hadoop, R, SPSS, SAS, Spotfire, Tableau – and other BI, spatial, matrix and analytics libraries – there's no shortage of choices for Netezza BI analysts and modelers. Thomas Dinsmore and John Rollins later gave the consultants' take on the Netezza “model”, touting pervasive analytics on demand, with the capability to provide benefit for all analytics constituents – from experts to executives.
IDC analyst Dave Vesset closed the conference on Wednesday with his talk, Big Data Small Decisions. Just as Steve Mills had his S's, Vesset noted the V's of business analytics – Volume, Variety, Velocity and Value. He conceptualized decision management as the cross classification of decision type – strategic, operational, and tactical – and decision attributes – scope, number, degree of automation and collaboration. Vesset recommended that companies looking to compete on analytics first perform an enterprise analytics strategy assessment. He then promoted project-based technology purchases and segmenting users by type, with awareness that one tool doesn't fit all.
All in all, a very educational and productive two days in Boston. One suggestion for next year: find a conference hotel in the Back Bay area near Boylston St. – a much livelier section of town than the convention center.