The potential for highly effective collaboration depends upon the ability to exchange relevant content rapidly and at a relatively low cost. Portals have made significant inroads in this area by providing frameworks for delivering content through personalized distribution, access controls and single points of access to a variety of corporate data sources. Portlets, or small, targeted programs designed to operate within a portal, are a common method for tying disparate systems together in a single portal interface. No one can argue with the success of portals as they exist today, and we should not expect the portlet model to change significantly. The real changes in the world of collaboration and content management will be well below the interface level of portals in the architecture of the services that provide the content for the user.

The first significant change is already widely acknowledged: the rise of Web services in distributed system design. The basic idea is that distributed applications use standard mechanisms for common interactions. For example, the Simple Object Access Protocol (SOAP) is used to invoke services and return results; the Web Services Description Language (WSDL) is used to describe services; and the Universal Description, Discovery and Integration (UDDI) protocol provides the means to publish, find and bind to different services. Together, these standards allow developers to package services and make them widely available.
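To make the invocation step concrete, here is a minimal sketch of how a client might build a SOAP 1.1 request envelope. The service namespace `urn:example:content-search`, the operation `FindDocuments` and its parameters are hypothetical names chosen for illustration; a real service would define them in its WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:content-search"  # hypothetical service namespace (assumption)


def build_soap_request(operation: str, params: dict) -> str:
    """Return a SOAP 1.1 envelope that invokes `operation` with `params`."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element and its children live in the service's own namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameters, for illustration only.
request_xml = build_soap_request("FindDocuments", {"keyword": "portal", "maxResults": 5})
print(request_xml)
```

The resulting XML would typically be sent to the service endpoint in the body of an HTTP POST; the transport step is omitted here because it depends on the service in question.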

Web service wrappers can be used with existing systems to open access to content sources. Relational and XML databases, content management systems, document management systems and other repositories could all be accessed in a standardized way. Yet any reasonable portal search tool can already crawl file systems, databases and just about any place we can store content, so why should we need to change the way we access and index content? That brings us to the second major change, the emergence of standards for the open exchange of content meta data.

The Dublin Core is a meta data standard whose goal is to support intelligent information discovery systems. It includes commonly used elements to describe a resource, such as title, creator, subject, description, publisher, contributor, date, type, format, identifier (for example, a uniform resource identifier, or URI), source, language, coverage and rights. The Dublin Core Metadata Initiative (DCMI) has recommendations or working drafts specifying how to express the Dublin Core in HTML and in the Resource Description Framework (RDF). A business special-interest group was recently formed within the DCMI to address the use of the Dublin Core in the commercial sector.
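As an illustration of the HTML encoding, the following sketch generates Dublin Core `<meta>` tags for a page header, using the common DCMI convention of a `DC.` prefix declared through a `schema.DC` link. The function name and the sample record values are assumptions made for this example.

```python
from html import escape

DC_SCHEMA = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace


def dublin_core_meta(record: dict) -> str:
    """Render a dict of Dublin Core elements as HTML <link>/<meta> tags."""
    lines = [f'<link rel="schema.DC" href="{DC_SCHEMA}">']
    for element, value in record.items():
        # Elements are repeatable, so accept a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        for item in values:
            content = escape(str(item), quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Hypothetical record, for illustration only.
record = {
    "title": "Quarterly Sales Report",
    "creator": "Finance Department",
    "subject": ["sales", "forecast"],
    "date": "2002-03-31",
    "format": "text/html",
    "language": "en",
}
html_head = dublin_core_meta(record)
print(html_head)
```

Because the tags are generated from a plain dictionary, the same record could also be serialized to RDF or returned by a Web service without changing how the content owner maintains it.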

The advantages of embedding meta data in HTML or XML Web resources are obvious: it provides extensive and precise descriptions of content that can be used for more effective searching and navigating. Search engines often use meta data tags, especially the description field, to more accurately classify content found on the Web. From a collaboration perspective, portal and content management applications can use the meta data to better identify relevant information for a particular user. More importantly, it opens the doors to better management of distributed content repositories.

Many organizations use enterprise-wide search engines to crawl their entire intranet and index internal content. This works well in many cases, given the practical limits of keyword indexing and statistical pattern matching. However, this model of crawling and indexing stops at the corporate borders, and therein lies the opportunity for a new model of resource discovery based upon Web services and meta data. Rather than crawling, resource discovery can move to harvesting. Harvesting gathers meta data about content rather than the content itself. The Open Archives Initiative (OAI) Protocol for Metadata Harvesting is one example of a harvesting protocol. The OAI protocol is based on HTTP and has been adopted for digital library, museum and other scholarly projects. OAI adopters face the same challenges as organizations engaged in business-to-business collaboration: a number of distinct organizations control access to valuable content, the content is managed on a range of decentralized platforms and users need a mechanism for discovering particular types of content. The OAI Protocol for Metadata Harvesting is one method of addressing this; harvesting based on a Web services model is another.
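A rough sketch of the harvesting side is shown below, assuming a repository that exposes unqualified Dublin Core through the OAI protocol's `ListRecords` verb. The base URL `https://partner.example.com/oai` and the sample response are invented for illustration, and the response is parsed from an inline string so that the example runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request; `from_date` limits the harvest to newer records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Extract identifier, title and creator from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "creator": rec.findtext(".//dc:creator", namespaces=NS),
        })
    return records


# Hypothetical response from a partner repository (assumption, not real data).
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:partner.example.com:doc-17</identifier>
        <datestamp>2002-03-31</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Supplier Pricing Guide</dc:title>
          <dc:creator>Procurement</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

print(list_records_url("https://partner.example.com/oai", from_date="2002-01-01"))
harvested = parse_records(SAMPLE_RESPONSE)
print(harvested)
```

Note that the harvester only ever sees the descriptive record; the document itself stays behind the owner's firewall until a user with the right permissions requests it.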

The evolution of collaboration and content management will require the ability to effectively discover and access content across corporate boundaries. Businesses are not likely to open their firewalls to partners who want to crawl their file systems and databases; therefore, a better model is needed. The model must allow owners of content to control how it is published, and it must provide the means for potential users to discover resources. Content meta data published through Web services is the next step in the evolution of collaborative systems.

For more on the use of unstructured content in collaborative and decision support systems, see Web Farming for the Data Warehouse (Hackathorn, 1999) and Document Warehousing and Text Mining (Sullivan, 2001). Architecting Web Services (Oellermann, 2001) provides an extensive overview of Web services. The Open Archives Initiative promotes interoperability standards for content.
