Content never stops growing. Research firm IDC estimates that by 2011, enterprises will be responsible for 1,530 exabytes of digital information - a 10-fold increase in only five years.1 As enterprises gain both more content and additional kinds of content, they also face significant fragmentation. According to a 2005 study by Forrester Research, 78 percent of companies have more than one content repository, while 43 percent have more than six.2 This number has probably jumped significantly since then, driven by new forms of digital content as well as corporate mergers and acquisitions.  

Providing easy access to all this content becomes more challenging every year, but it’s essential for business success. IDC found that the typical information worker spends close to 25 percent of their time searching for and accessing content – and can waste up to three and a half hours per week looking for information that is never found.3 Meanwhile, the Web has led both information workers and consumers to demand seamless communication and collaboration – forcing the enterprise to support real-time sharing of content and ideas. In this new environment, deploying a comprehensive content integration system that permits enterprise-wide searching and content access is a must.


A variety of solutions have emerged to address these needs, from basic consolidation (migrating all content to a single enterprise content management, or ECM, system) to a “hub-and-spoke” model that connects repositories based on different ECMs through common connectors. The most flexible and powerful strategy for today’s enterprises, however, is the new “content mashup” approach.


Hobbled by Hub-and-Spoke


Although consolidation offers some advantages – such as the ability to easily search for content through a single interface and the need to administer only one system – it is ultimately far too limiting for large enterprises. No matter what vendors claim, in practice ECMs rarely handle all kinds of content well, which creates problems when working with unusual files. Combining existing repositories is also time-consuming and expensive, and the work never ends, because new acquisitions require further migrations. Finally, consolidation makes organizations heavily dependent on a single vendor, which could cause problems if the industry transitions to a new platform.


The hub-and-spoke solution was proposed as an alternative to consolidation. In this model, standard connectors are used to connect disparate repositories to a master content integration platform. Users can then access content through a single interface, such as an enterprise portal. Hub-and-spoke should allow different divisions or geographies to maintain separate content repositories, while providing users with the benefits of a single sign-on and unified search.


Unfortunately, hub-and-spoke has fallen short of expectations in many respects. While the Java-based JSR-170 and JSR-283 standards were developed to encourage the creation of widely accepted, off-the-shelf connectors, they have not been broadly adopted - forcing companies to custom-build their own. Applications from alternate platforms such as .NET or C++ also won’t work effectively with either standard, which is a particular concern for the many companies using Microsoft Office SharePoint. The basic architecture of the hub-and-spoke model raises further concerns as well. Because users search and access content through the master content integration platform, it can quickly become a performance bottleneck, particularly given the swift pace of enterprise content growth.


Maximize Content with Mashups


The content mashup strategy addresses many of the concerns raised by hub-and-spoke. Like Web application mashups, the content mashup brings together content from a variety of sources so that it can be used through a single comprehensive tool. It accomplishes this task with a service-oriented architecture (SOA) approach that takes cues from peer-to-peer (P2P) file sharing networks and Web 2.0 sites like Flickr, whose database of images and open protocols have inspired hundreds of applications.


Content mashups begin by automatically generating metadata or other identifiers (such as thumbnails or summaries) from content. This metadata can then be used to create a consolidated index for extremely fast universal searching and browsing. Meanwhile, a set of application programming interfaces (APIs) can be used to transfer files, edit content or create applications that leverage this content. The result is a virtual content integration that links disparate repositories – even those based on different technologies – without forcing them onto a single platform.
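To make the consolidated index more concrete, here is a minimal Python sketch. It assumes metadata has already been extracted from each item, and every name in it (ContentRecord, ConsolidatedIndex, the repository labels) is hypothetical rather than taken from any real product. The point it illustrates is that only metadata is centralized, while the content itself stays in its original repository.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentRecord:
    """Metadata for one item; the item itself stays in its repository."""
    repository: str  # hypothetical repository label
    item_id: str     # identifier local to that repository
    title: str
    summary: str


class ConsolidatedIndex:
    """Toy inverted index over metadata drawn from many repositories."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> {(repository, item_id)}
        self._records = {}

    def add(self, record):
        key = (record.repository, record.item_id)
        self._records[key] = record
        for word in f"{record.title} {record.summary}".lower().split():
            self._postings[word].add(key)

    def search(self, query):
        """Return records whose metadata contains every word of the query."""
        words = query.lower().split()
        if not words:
            return []
        keys = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(
            (self._records[k] for k in keys),
            key=lambda r: (r.repository, r.item_id),
        )


# Two repositories on different (assumed) platforms share one index.
index = ConsolidatedIndex()
index.add(ContentRecord("london-ecm", "42",
                        "Spring campaign video", "High-resolution launch video"))
index.add(ContentRecord("tokyo-sharepoint", "7",
                        "Spring campaign brief", "Creative brief for launch"))
hits = index.search("spring launch")
for hit in hits:
    print(hit.repository, hit.item_id, hit.title)
```

A production index would add ranking, access control and incremental updates; the sketch only shows why searches can be fast without moving any content.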


To better understand the benefits of a content mashup solution, let’s consider how it might work at a multinational company with strong information sharing and collaboration needs. Consider, for example, a massive advertising conglomerate. With a broad range of subsidiaries specializing in a variety of disciplines, this company would have numerous content repositories running different ECMs around the world. Yet sharing information across both geographic and institutional borders would provide the company with a significant competitive advantage.


A content mashup would let information workers seamlessly share content organization-wide while still allowing IT staff to store content locally using the best ECM system for their needs. Workers could use the consolidated index to find content, achieving many of the benefits of centralization, such as a single log-on, minimal training and comprehensive search. To retrieve an item, workers would simply submit a download request through the directory. Software agents could then negotiate a peer-to-peer transfer directly from the original repository and ensure rapid delivery and efficient use of network resources even for very large files such as high-resolution video.
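The request-and-transfer flow described above can be sketched in a few lines of Python. This is an in-memory simulation under stated assumptions: the Directory, TransferTicket and fetch names are hypothetical, repositories are modeled as plain dictionaries, and a real agent would move chunks over the network rather than slice a byte string. What it shows is that the directory only brokers the request, while the bytes come straight from the source repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferTicket:
    """What the directory hands back: where to fetch from, not the content."""
    item_id: str
    source_repository: str
    chunk_ranges: tuple  # (start, end) byte ranges, end exclusive


class Directory:
    """Hypothetical directory that knows where each item lives and its size."""

    def __init__(self, chunk_size=4):
        self._locations = {}  # item_id -> (repository, size in bytes)
        self._chunk_size = chunk_size

    def register(self, item_id, repository, size):
        self._locations[item_id] = (repository, size)

    def request_download(self, item_id):
        repository, size = self._locations[item_id]
        ranges = tuple(
            (start, min(start + self._chunk_size, size))
            for start in range(0, size, self._chunk_size)
        )
        return TransferTicket(item_id, repository, ranges)


def fetch(ticket, repositories):
    """Pull each chunk directly from the source repository and reassemble."""
    data = repositories[ticket.source_repository][ticket.item_id]
    return b"".join(data[start:end] for start, end in ticket.chunk_ranges)


# Simulated repository holding one small "video" file.
repositories = {"london-ecm": {"video-42": b"0123456789"}}
directory = Directory(chunk_size=4)
directory.register("video-42", "london-ecm", size=10)

ticket = directory.request_download("video-42")
payload = fetch(ticket, repositories)
print(ticket.chunk_ranges, payload)
```

Splitting the file into ranges is one assumed way to keep large transfers efficient, since chunks could be fetched in parallel or resumed after a failure; the article does not prescribe a specific mechanism.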


This approach delivers the benefits of centralization without its limitations. Because content remains in its original repositories, no single system or server will create bottlenecks. Meanwhile, multiple standards and ECM systems are accommodated, so the enterprise will not be tied to any one platform and new repositories and systems can easily be added. Content mashups can also work well with platforms other than Java, so C++ and .NET applications will be supported. Their flexibility allows companies to quickly deploy new applications and support the broad variety of devices and networks that are standard for today’s mobile users.


Taking Stock


Content mashups are still an emerging technology, and enterprises have only just begun to adopt them. But given their strengths, they are likely to play an increasingly important role in content integration initiatives going forward.

How can you start adopting content mashups for your enterprise? The first step is to take an inventory of your content assets, which includes asking:

  • How many repositories are within your enterprise?
  • What technologies are the repositories running on?
  • What ECMs or other content management systems are already in place?
  • Which repositories need to be accessed across the enterprise?
  • Do you have an ongoing need for all repositories, or can some be retired?

For companies with a relatively small number of repositories and fairly consistent types of content, consolidating them onto a single ECM system may still be a viable option. However, it’s important to consider future needs as well. If future acquisitions or a geographic expansion are planned, consolidation may not be the right solution.


Most large or rapidly growing companies will find they need more flexibility. In some limited instances, a hub-and-spoke model could be sufficient, for example at companies that have a relatively small pool of Java-based assets. Overall, however, the content mashup approach is currently the best way to accommodate the diverse assets and technologies seen in the contemporary enterprise.


The added flexibility of content mashups will allow organizations to use and reuse content across boundaries, increase the efficiency of their information workers and IT staff and remain nimble enough to adopt new repositories and technologies. Adopting a comprehensive yet highly adaptable content integration strategy has become a critical business imperative.




1. John F. Gantz. “The Diverse and Exploding Universe.” IDC, 2008.

2. Doug Henschen. “In Focus: Oracle the Content Integration Fray.” Intelligent Enterprise, 2005.

3. Marydee Ojala. “Searching for Efficiency.” EContent Magazine, 2007.
