We've heard cries that the enterprise search market is dead, that the fight is over and that it is now a two-horse race between Microsoft and Google. There may be some truth in this, but it's certainly not the whole story. The enterprise search market continues to move forward, albeit with a few major challenges. The downside is that these challenges can make what seems like a simple search product selection and implementation project for the enterprise more complex and costly than expected.

As of mid-2011, the search market is dominated by three big names: Google, Microsoft and Apache Lucene. The Google appliance continues to sell well at the departmental level within organizations. Microsoft, via its dominance with Office and SharePoint, has become the de facto search application for many back-office needs. And Lucene has become ubiquitous as an embedded search engine in many products and applications. With these three very different dominant search engines in the market, it can be hard to make a mark as an independent search vendor today.

We could add a fourth name to the list in the form of Autonomy, the U.K. tech giant currently being acquired by HP. Though it remains a formidable force, Autonomy has lately been known more for acquiring myriad companies and products linked together via the IDOL platform than as a major enterprise search vendor.

Compounding this situation is the fact that the enterprise search market can well be described as "slow moving." Though the names change over the years, the underlying technology remains much as it was a decade ago. On the one hand, this is a positive thing because it has delivered mature, scalable, well-tested offerings that generally work well. On the other hand, we have search engines that continue to fall short of end users' expectations (no matter how unrealistic those expectations may be). As has become the norm, expectations are set largely by Internet search experiences via the likes of Google and Bing - a very different search paradigm with few of the necessary restrictions or challenges of enterprise search.

So, in a market where users are skeptical of the results delivered by enterprise search vendors and dismissive of the clunky (non-Internet) user interfaces they navigate, where are things to go?

In the short term, we expect that small search vendors will continue to build out their own search-based applications. We expect canned search queries and analytical capabilities to better serve very business-specific needs in industry verticals such as health care, legal and retail. In some sense, search vendors are trying to do what portal vendors previously attempted to do: hide the underlying infrastructure, pre-"can" the query/process and provide a dashboard to deliver specific results. The success of this approach remains to be seen.

Even though the search market moves slowly, other things change rapidly and requirements go into and out of fashion. In 2011 and likely through 2012, we expect that large organizations will continue to view the goal of federated search as a very real pursuit, even though the cost and complexity of such a project often make it impractical. Most true enterprise search situations today require a search engine to crawl across more than one repository of information, for example, multiple business applications or document management systems. At a high level, this can be done in one of two ways: by using a single engine across all information sources or by using multiple engines and collating results back through a single location.

In the first situation (unsurprisingly favored by search vendors) a single "über" search engine is designated for the entire enterprise, centrally managed and administrated. It connects in one way or another to each and every information store it needs to search. At search time, it searches against all (or a selected sub-section) of the stores, retrieves the results and typically tries to normalize the results into a single search response for the end user.

In the second situation, a central service is also used, but rather than serving as an über search engine, it is a normalizing hub that connects to multiple search engines instead of multiple data stores. It collates multiple result sets and normalizes results for the end user.
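To make the hub idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Hit` record, the `federated_search` function and the two stand-in engines are illustrative names, not any vendor's API, and round-robin interleaving is just one simple way a hub might collate ranked lists.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str  # which engine returned the hit (hypothetical field)
    doc_id: str
    title: str


# A backend is any callable that takes a query and returns ranked hits.
Backend = Callable[[str], List[Hit]]


def federated_search(query: str, backends: Dict[str, Backend], limit: int = 10) -> List[Hit]:
    """Fan the query out to every engine, then interleave the ranked lists."""
    result_sets = [backend(query) for backend in backends.values()]
    merged: List[Hit] = []
    # Round-robin: rank 1 from each engine, then rank 2, and so on.
    for tier in zip_longest(*result_sets):
        merged.extend(hit for hit in tier if hit is not None)
    return merged[:limit]


# Stand-in engines; a real hub would call each engine's own query interface.
def wiki_engine(query: str) -> List[Hit]:
    return [Hit("wiki", "w1", f"{query} handbook"), Hit("wiki", "w2", f"{query} FAQ")]


def dms_engine(query: str) -> List[Hit]:
    return [Hit("dms", "d1", f"{query} policy.pdf")]


results = federated_search("travel", {"wiki": wiki_engine, "dms": dms_engine})
for hit in results:
    print(hit.source, hit.doc_id, hit.title)
```

The hub never touches the underlying data stores. It depends only on each engine returning a ranked list, which is what distinguishes this design from the single-engine approach.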

At one level, it seems logical to seek a single search engine for every use, and this is an approach followed by vendors such as IBM and Autonomy. But experience in the field tells us that this seldom works as well as proposed. Making direct connections to various information stores is not a lightweight task, despite vendor claims of "out-of-the-box" connectors. Connecting to the actual store is easy enough, but making sense of your particular configuration of the content and structures within that store is a much larger task.

The challenges don't end there. Performance and scalability become issues as more data sources are added, and the task of normalizing results from many different sources (including deduping) can be onerous. Multirepository/federated search is certainly an important component of virtually all the systems we evaluate in our research, but like so many different elements of search technology, getting this to work effectively often requires a host of tradeoffs.
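A small sketch shows why normalizing and deduping are harder than they sound. It assumes each engine returns (document key, raw score) pairs on its own scale, rescales them with min-max normalization and treats keys that match after trimming and lowercasing as the same document. Both choices are illustrative assumptions, and real deployments usually need far more careful duplicate detection.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, float]  # (document key, raw engine score)


def min_max_normalize(results: List[Result]) -> List[Result]:
    """Rescale one engine's scores to the 0..1 range."""
    if not results:
        return []
    scores = [score for _, score in results]
    low, high = min(scores), max(scores)
    span = high - low
    # If every score is identical, treat all hits as equally relevant.
    return [(key, 1.0 if span == 0 else (score - low) / span) for key, score in results]


def merge_and_dedupe(result_sets: List[List[Result]]) -> List[Result]:
    """Combine several engines' results, keeping the best score per document."""
    best: Dict[str, float] = {}
    for results in result_sets:
        for key, score in min_max_normalize(results):
            canonical = key.strip().lower()  # naive duplicate key (assumption)
            best[canonical] = max(score, best.get(canonical, 0.0))
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


engine_a = [("Policy.pdf", 12.0), ("handbook.doc", 6.0), ("memo.txt", 3.0)]
engine_b = [("policy.pdf", 0.9), ("faq.html", 0.5), ("memo.txt", 0.1)]

merged = merge_and_dedupe([engine_a, engine_b])
for key, score in merged:
    print(f"{key}: {score:.2f}")
```

Even in this toy case the tradeoffs are visible: min-max scaling makes every engine's top hit look equally strong, and the same file stored under two different names would slip past the duplicate check.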

For IT managers faced with updating or replacing existing search technology, I advise you to look at the different options available; it's not just a two- or three-horse race. There are many interesting options to explore from vendors including Endeca, Vivisimo, dtSearch and Exalead, each with differing strengths and weaknesses. Whichever option you finally decide upon, if your ultimate goal is to provide a single point for search within your organization, do not underestimate the complexity and potential cost of your project. Searching a single corpus of data may be a commodity process these days, but searching multiple corpora and normalizing the results remains as challenging as ever.
