Dive Into Open Source
The last five years have brought a fresh wave of thinking to the frontier of business intelligence. The advent of massively parallel processing, data appliances, in-memory databases and cloud computing is challenging old technology stereotypes. New demands to confront mission-critical instances of operational data, predictive analytics and unstructured content have arrived in response to a changing wave of business objectives. All this is happening in a lingering climate of cost and resource constraint. A challenge for many organizations is the question, “What can we do next?”
We might start with what we already know, because, amid change, old obstacles persist painfully and cost a lot of money to remedy. An expensive toolbelt of application resources for reporting, dashboards, integration, data movement and change data capture already hangs at the waist of many organizations. Other organizations have simply lacked the financial means to tackle desirable BI-related projects at all because of the high cost of entry.
In that light, an equally old resource, open source software, has spawned commercial startups that also want your money, but likely much less than the name-brand BI product vendors are asking upfront and overall.
The growing appeal of commercial open source vendors is partly a matter of timing and partly one of maturity. What you’ll pay for and what you’ll get still has to do with your actual needs and your pocketbook, but we’ll stipulate that the point has been made and isn’t going away. While open source BI revenue remains in the shadow of proprietary commercial off-the-shelf (COTS) remedies, Gartner Inc. predicts the use of BI tools from open source providers will grow fivefold through 2010.
As for maturity across technologies that include databases, BI and analytics, IT developers are already well acquainted with open source. Forrester Research found that 80 percent of professional developers have already used open source software in at least part of their daily development activities.
For the nontechnical business sponsor looking into open source projects for the first time, some perspective is in order (see sidebar). And for all parties investigating commercial open source, a bit of research will help everyone understand terms such as open source, open core, free software, public domain and shareware. Navigating these terms is necessary for shaping your view of what you’re getting into with commercial open source software and how you can benefit.
We can provide a couple of use cases and advice to get you started. We’ll thumbnail how two companies use commercial open source products and end with some reflections on their experience.
Small Business, Broad BI
Peter Schmidt is the director of business intelligence at Centro, a 120-employee media services firm that connects advertising agencies with digital publishers. Across his 18 years in IT, Schmidt had worked on BI platforms using traditional COTS products for U.S. Cellular and others, and built a BI program from the ground up for United Airlines Loyalty Services, where he served as a director of enterprise architecture.
Upon arriving at the much smaller Centro two years ago, Schmidt got his first real taste of open source, but his transition was prepaved. Centro’s most-used application was already open source, a media-planning tool that’s been newly updated and released for online self-service and software-as-a-service delivery at transis.com.
What Centro lacked was corporate visibility to its own data beyond antiquated spreadsheet reporting. After looking into several open source options, the company chose Pentaho’s enterprise suite. “I liked that it was one deep stack where they provide the data integration, the reporting, the OLAP analysis, the predictive mining and machine learning and a platform that glues it all together,” Schmidt says.
And it surely beat the exhaustive nine-month evaluation processes he’d seen at big companies. “About the worst thing that could happen is you picked the wrong thing and it doesn’t scale right away or you have to swap something out. We did our homework, became an enterprise subscription customer at a nominal fee and the rest is history.”
Centro’s BI now supports about 60 self-service reports and analytics for static reporting, parameterized reporting and OLAP dashboards. The main user base has been sales, sales support and sales research. New employees are coming on at Centro, and Schmidt sees no problem in scaling as he improves his services.
Schmidt figures self-service has saved Centro about $80,000 at an hourly IT service rate over the last 15 months based on the 700 to 1,000 data requests per month that used to involve IT gathering and manipulating data. It has freed resources and allowed IT to focus on its core role.
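The figures Schmidt cites can be turned into a rough per-request number. The short Python sketch below is only a back-of-the-envelope reading of the article's figures, not Centro's own calculation. It assumes the $80,000 is spread evenly over the 15 months and that every one of the 700 to 1,000 monthly requests moved to self-service. The function and variable names are illustrative.

```python
# Figures quoted in the article
TOTAL_SAVINGS_USD = 80_000
MONTHS = 15
REQUESTS_PER_MONTH_LOW = 700
REQUESTS_PER_MONTH_HIGH = 1_000


def implied_saving_per_request(total_savings, months, requests_per_month):
    """Average IT cost avoided per data request.

    Assumption (not from the article): savings are spread evenly across
    all requests in the period.
    """
    return total_savings / (months * requests_per_month)


# Fewer requests per month means each request accounts for more of the saving.
high_estimate = implied_saving_per_request(TOTAL_SAVINGS_USD, MONTHS, REQUESTS_PER_MONTH_LOW)
low_estimate = implied_saving_per_request(TOTAL_SAVINGS_USD, MONTHS, REQUESTS_PER_MONTH_HIGH)

print(f"Implied saving per request: ${low_estimate:.2f} to ${high_estimate:.2f}")
```

On those assumptions, each request that no longer passes through IT is worth roughly five to eight dollars of service time.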
Centro’s goals and data sets were much smaller than those Schmidt ran into at United, though they might be comparable to a departmental project at a large company. Also, the system may not complete everyone’s wish list at Centro, but as new enterprise features arrive, like Pentaho’s “extremely slick” new OLAP analysis front end, Schmidt prefers to focus on the gains.
“I’m always saying, ‘Guess what guys, we didn’t shell out all these hundreds of thousands of dollars, and if it doesn’t do this little thing now, move on,’” says Schmidt. “You’re always trying to make things better, but if you’re smart and calculated, you want to use the parts that work really well for you.”
Big Business, Specialized BI
A different problem confronted Woody Christy, senior engineering product manager at a large U.S. cable television provider with tens of thousands of employees. Christy’s company (which we are not allowed to name directly) uses a variety of name-brand BI tools for different purposes; Christy’s job is to manage information around video-on-demand products for bandwidth capacity planning or troubleshooting.
This work requires static reporting but leans more heavily on operational data in dashboard views of latency and capacity for customer demand on its video inventory. For Christy’s needs, standing capacity and troubleshooting need to be close to real time.
Christy chose Jaspersoft’s BI platform largely on the merit of its dashboards and drill-throughs, and he maintains a catalog of 75 or 80 reports for planning or technical issue follow-up. Large cable providers are also called multiple system operators (MSOs), meaning they own or operate a large number of cable systems or networks. “We can create dashboards for BI and reporting off of a database that goes through and collects data from different distributed systems,” says Christy. “Then it comes back and correlates all the data in Jaspersoft using the presentation layer.”
It’s not that open source is the “secret sauce” in the dashboards Christy says are so useful that they provide his company competitive advantage. It’s more that the dashboard builder that comes in the enterprise edition of Jaspersoft builds these valuable views with little or no tinkering with the code. “Everything that’s displayed comes from the queries and reports we wrote, but the actual dashboard builder just built the display for us,” says Christy. “In that sense, Jasper is really good from a maturity perspective compared to Red Hat or JBoss.”
Scale hasn’t been an issue for Christy. On a very busy day, as many as 500 million events might surface in Jaspersoft reports, though usually no more than 20 people are logged into the system at a given time.
Time to deployment and cost were bigger drivers for Christy than the flexibility associated with open source. “You go down the path of writing your own system from open source and you’ll see that it just doesn’t make sense. And besides cost, I think the learning curve is shorter than it is for a Cognos or other product,” he says. “The reason we went with Jaspersoft was because it was just much more bang for the buck and much less overhead.”
Schmidt and Christy both use enterprise versions of commercial open source software and the enhanced features not included in the free open-core versions. Each is satisfied with their deployment and indicated they’d likely follow the same path again. Though they’ve never met, both raised similar issues likely to be confronted by organizations approaching such a strategy. We summed them up here and also asked Forrester Research analyst Jeffrey Hammond to comment on our observations.
Match business goals. The two deployment examples presented here are a tiny subset of use cases for BI and specific feature functionality. They also represent only two vendors – both platform providers – in a broad field that includes specialized products and services. Commercial open source development agendas and priorities vary and might not match your specific project needs. Schmidt benefited most from the integrated suite of Pentaho, while Christy was drawn to the operational capabilities of Jaspersoft.
You don’t want to go into this kind of adoption thinking about what you might get someday, Hammond says. The big qualifier for companies testing the open source experience is what he calls the “good enough” experience. “There generally isn’t a throat to choke unless you pay for some kind of a support contract. Otherwise, caveat emptor is the rule.”
Cost or flexibility – or both? Lower cost and the ability to manipulate source code are usually cited as the two top attractions of commercial open source products, but Schmidt and Christy both leaned to cost benefits by a wide margin.
“What you see first is that it’s not six figures for integration software, six figures for reporting licenses and six figures to get the underlying infrastructure together,” says Schmidt. “The mover at that level is, why not give it a try?”
When it comes to tweaking underlying source code, Schmidt and Christy take advantage, but they are just as happy to use enterprise software editions out of the box. “I like the ability to look at the source code, but I really like it when I don’t have to,” says Christy. “We can write custom renderings for graphs, which is nice, but I’m fine if it works the way it came.”
Lead with what works well. As mentioned, different products and platforms stress different kinds of functionality, much of which may be irrelevant. It’s another reason to focus on the business problem at hand. Both Schmidt and Christy strip down their branded versions of software and conceal features not suited to local skills – which vary widely across open source users.
“In the best case where you have everything you want, do you need everything you have?” asks Schmidt. “People are working all over to make open source products better, but a lot of the design tools aren’t something you’d sit in front of a user, so there’s still a kind of an IT ownership to the implementation, and sometimes it leads back to nitpicking over small issues of functionality.”
“Jaspersoft is competing with companies like Cognos with ad hoc reporting,” says Christy. “To be honest, that’s a feature people cry for, but you can’t put that [console] in front of common end users with hundreds of millions of rows underneath. I’ve intentionally hidden that so our end users can’t see it, but if you are serving MBAs in a financial market, you’d probably want it exposed.”
Investigate subscription models. Professional or enterprise server editions of commercial open source have subscription models that may be per user or by instance. Where small user numbers are involved, this may not matter much, but large communities or expansion plans could change the equation.
“Per-user models never work for us because I have no idea how many people log in at any given time,” says Christy. “We don’t track it that way, and I prefer a per instance or enterprise-wide license, which is also easier for us to audit.”
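To see how user counts change the equation, it can help to work out a break-even point between the two subscription models. The Python sketch below is purely illustrative. The prices are hypothetical placeholders, not figures from Pentaho, Jaspersoft or any other vendor, and the function name is our own.

```python
import math


def break_even_users(instances, price_per_instance, price_per_user):
    """Smallest user count at which per-instance pricing costs no more
    than per-user pricing, for the same subscription period."""
    return math.ceil(instances * price_per_instance / price_per_user)


# Hypothetical annual prices, chosen only to illustrate the arithmetic.
HYPOTHETICAL_PRICE_PER_USER = 500
HYPOTHETICAL_PRICE_PER_INSTANCE = 20_000

users_needed = break_even_users(
    instances=1,
    price_per_instance=HYPOTHETICAL_PRICE_PER_INSTANCE,
    price_per_user=HYPOTHETICAL_PRICE_PER_USER,
)
print(f"Per-instance pricing breaks even at {users_needed} users")
```

Below the break-even count a per-user subscription is cheaper. Above it, or when the head count is unknown, as in Christy's case, a per-instance or enterprise-wide license is easier to budget and audit.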
Understand documentation, community and support. Virtually all open source software involves communities of users who share ideas and fixes, and the best of these find their way into paid versions of software with the most attention. But, as Schmidt says, “Most of the time you’re asking a question because there is a piece that’s lacking.”
This is true, says Hammond, but at least Schmidt can see the conversation in the community, more than can be said for many proprietary products. Christy stresses the value of libraries of documentation that come with paid software editions.
“I’ve been active in communities, but I mostly haven’t had to go there,” he says. “The documentation on how to modify what we need has done the job for most projects, and it’s faster than going to the community.”
Watch the broader commercial market. Proprietary software vendors have no choice but to respond if they want a piece of the market now moving upstream with commercial open source providers. Still other models, including cloud-based SaaS BI from providers such as Birst, are looking to attract the same crowd.
“Customers are making a choice based on the least expensive option to start with because they have a really low downside,” says Hammond. “And if that works, they’ll stay with it because it roughly meets their needs, and that blocks the commercial product.”
Based on his enterprise experience, Schmidt admits to being “really intrigued” by last year’s offer of 100 free user licenses from proprietary BI software vendor MicroStrategy. As MicroStrategy related at the time, it was an effort to gain a foothold in nonstandardized departmental and line-of-business BI startup efforts to complement its existing strength at the higher end of the enterprise BI market.
“I thought about that value proposition,” Schmidt says. “I was looking at MicroStrategy back in 1994, so it’s a third generation tested and true product that’s had its tires kicked over and over. How do you not think twice about that if you’re a small or medium-sized company? You’re growing into a product that billion-dollar companies are running.”
Hammond agrees that the COTS market is moving downstream in multiple sectors of app servers, databases and operating systems one system at a time as open source continues to move upstream. If the outcome is not yet clear, it’s a smart thing for consumers to watch, because, “at a minimum, it gives you something to compare to.” But as always, examine your key requirements against what’s “free” and what comes at incremental cost.
Eyes on a Prize
What open source might have done best is open the playing field for business intelligence, analytics and integration to organizations that never had the budget to pursue it otherwise. Better yet, it has taken a collaborative community approach to enterprise technology and opened a public sandbox of experimentation for smaller organizations, departments and lines of business with a like mindset.
Commercialized and hybrid open source is not a zero-sum game with a predetermined winner or loser. What divergent business models – those that take a different path to the same end – have always delivered is a way to level the playing field. This has happened over and over, from electronic data interchange to Web applications to your telephone service. In all those categories where incumbents were challenged, they answered with price cuts and lower-end products.
It’s going to be the same for open source, so as you kick the tires, be mindful of your immediate needs (where timing might dictate your choice) and watch for potential responses from incumbent BI vendors. Throughout the process, with the right mindset and understood goals, all ends of the commercial software spectrum should continue to improve the risk and value proposition for business intelligence.
A (Very) Condensed History
Many people look at open source software as a new evolution. But trace the roughly 50-year history of commercial computing to its genesis and you’ll find open source software right there in the big-iron IBM mainframes of the day. All the users of those systems could access the software code and manipulate it to the extent of their ability. Open source helped the ARPANET become the Internet with the first applications for exchanging comments and managing email in the academic community.
The arrival of proprietary pre-customized operating systems, databases and applications for the enterprise and eventually the Internet greased the rails for license-based commercial off-the-shelf (COTS) software. These products, also known as “shrinkwrapped” software, meant you were buying a lot of prebuilt functionality in a box of floppy disks or CDs.
For business, complex back-end platforms of COTS were built as a whole to be deployed across departments and divisions. Expensive R&D fell to vendors rather than customers to ensure everything would work. For corporate networks and the Internet, one-off applications gave way to mass standardization. In these cases, users could license but not access or manipulate the underlying code of most proprietary software.
For mass production and desktops, closed, proprietary software made a lot of sense. Lotus would always understand Lotus, Office would always understand Office.
Unlike freely shared software, customers would buy a license for proprietary software up front – often for each individual user – and would pay annual maintenance fees, and in many cases pay still more for large upgrades of the product. And unlike hardware, software comes with virtually no cost of replication beyond the media it is stored on. The boom of enterprise and personal computing made mountains of money for many companies, most visibly Microsoft, and killed off many worthy but less-effective competitors.
Open source software, as we also know, never went away. It thrives today in the hardware ambitions of IBM or Oracle in the form of the Linux operating system or MySQL database, among many examples.
But a divergence arrived in 1989 with Cygnus, believed to be the first commercial open source software provider. Cygnus offered paid support for open source software already in the market, and touched off a series of commercial undertakings that first relied on services, and later, an optional paid subscription model that added functionality beyond free “open core” code. Cygnus would be acquired by Red Hat, but it spawned a host of commercial value-add open source providers that have overcome initial resistance to become low-cost, specialized alternatives to traditional COTS.
Observers now remark that commercial open source software vendors are starting to resemble traditional COTS vendors by roping off more and more functionality for paid subscribers, albeit without the upfront license cost. Equate the subscription cost – with pricing models that might be per user or per server instance – with the maintenance of traditional COTS, they say, and it’s a wash, or a different equation to consider.
Some organizations have a social agenda to keep software free. The GNU and GNU/Linux projects have a policy to “copyleft” software to keep it from becoming proprietary. With required free access and through sharing, such projects have separately spawned any number of communities dedicated to operating systems, databases and applications – including business intelligence. –JE