JAN 2, 2007 1:00am ET


Information Scarcity to Information Overload


The focus in enterprise content management (ECM) is shifting from ending information scarcity to dealing with information overload. This dynamic explains why the disparate technologies of search, records management and analytics are now hot.

In the past, different departments managed different media types; they used separate systems and specialized equipment to perform these tasks. The underwriting department within an insurance company used an imaging/workflow system to manage the underwriting process; the regulatory affairs department within a pharmaceutical company made sure that a document management system controlled submissions to the FDA; marketing enlisted a Web content management system to manage the corporate Web site and so on (see Figure 1).


Figure 1: Early ECM Silos

However, these media-centric silos are starting to topple. Repositories are starting to manage multiple content types and enterprises are taking a more lifecycle-centric view of the problem.

This viewpoint shift is altering how both enterprises and vendors approach ECM. A stream approach is taking over, in which multiple systems create and manage a content stream. Standards such as XML, RSS, UIMA and JSR-170 make this way of working possible. ECM applications are increasingly being subsumed into the system infrastructure.

In this world, content management solutions perform seven main processes:

  1. Creation: The tools used to create content, which formed the first wave of content management applications.
  2. Storage: Applications that store and manage active, unstructured content.
  3. Distribution: These systems push content to users and other applications, thereby making content pervasive and available on a wide variety of devices.
  4. Discovery: These applications help workers find relevant content by letting them query repositories or drill down into a hierarchy.
  5. Archiving: These systems store inactive content in business-friendly ways so it can be retrieved later.
  6. Analytics: These solutions serve as a feedback loop for improving content creation, distribution and discovery.
  7. Management: Management is evolving from an application that manages a specific media type (e.g., documents or Web pages) to a service that supports the entire content lifecycle.

The first six make up major product categories. The last one, management, is a foundation the others rest on (see Figure 2).


Figure 2: ECM Process Framework

The second shift concerns the consequences of generating so much digital content over the past 30 years. In the 1950s and 1960s, secretaries typed only important documents because it was too expensive and time-consuming to type ephemera. However, with Microsoft Word on almost every desktop, employees now type up memos as a matter of course, documenting important subjects as well as trivia.

In short, we've moved from an environment of information scarcity to one of information overload. This has had a profound impact on what content management problems enterprises now need to solve. It is driving them to focus on the last three stages in the ECM process framework: discovery, archiving and analytics (see Figure 3).


Figure 3: Information Scarcity vs. Information Overload

Discovery

The technologies that aid in information discovery - search and categorization - are undergoing a rapid transformation. Historically, Web search, enterprise search and desktop search have been stovepiped, facing different enablers and challenges, and sold by different vendors (see Figure 4).


Figure 4: Search Categories

Users are beginning to question why they need to switch applications to search the Web, their company site or their desktop - when all they want to do is to find the relevant information no matter where it resides. Accordingly, vendors such as Google, FAST and X1 are starting to offer universal search.

Search is no longer just about looking for unstructured content. In June 2006, Google announced its new Google OneBox for Enterprise feature of the Google Search Appliance, which enables employees to search for information stored in operational systems, such as purchase orders within an ERP system. Meanwhile, other search companies such as Verity (now owned by Autonomy) and FAST Search have been integrating search with operational and BI applications for years.

Categorization is another technology that enables users to find what they need. Companies such as Autonomy, Endeca, InQuira and Recommind use automated categorization techniques to assign categories or topics to documents, thereby helping workers zero in on a set of related documents. It's also becoming easier for users to manually categorize, or tag, the documents. Popularized by sites such as Flickr and del.icio.us, social tagging lets workers do the categorization. Because the resulting folksonomies reflect how users view the content, the structures morph over time as workers' views of the business change. This is in contrast to official taxonomies that can become divorced from reality if editors do not continually update them.
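To make the contrast with an official taxonomy concrete, the sketch below shows one way individual tagging actions could be rolled up into a folksonomy. It is illustrative only: the function names, the (user, document, tag) event shape and the sample data are assumptions, not a description of any vendor's product.

```python
from collections import Counter, defaultdict


def build_folksonomy(tag_events):
    """Aggregate (user, document, tag) events into per-document tag counts.

    Hypothetical helper: each user counts at most once per document and tag,
    and tags are compared case-insensitively.
    """
    seen = set()
    tags_by_doc = defaultdict(Counter)
    for user, doc, tag in tag_events:
        normalized = tag.strip().lower()
        key = (user, doc, normalized)
        if key in seen:
            continue  # the same user already applied this tag to this document
        seen.add(key)
        tags_by_doc[doc][normalized] += 1
    return tags_by_doc


def top_tags(folksonomy, doc, n=3):
    """Return the n most frequently applied tags for a document."""
    return [tag for tag, _ in folksonomy[doc].most_common(n)]


# Made-up sample data for illustration.
events = [
    ("ana", "q3-plan.doc", "Budget"),
    ("raj", "q3-plan.doc", "budget"),
    ("raj", "q3-plan.doc", "forecast"),
    ("ana", "q3-plan.doc", "budget"),  # duplicate from the same user, ignored
]
folksonomy = build_folksonomy(events)
print(top_tags(folksonomy, "q3-plan.doc"))
```

Because the counts are recomputed from whatever workers tag, the top tags for a document shift as usage shifts, with no editor involved.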

Discovery technologies are evolving rapidly, and it behooves businesses to stay abreast of the latest capabilities. Otherwise, they are needlessly sentencing their workers to a lot of hard labor in searching.

Archiving

Archiving, the act of making active content inactive, is another rapidly evolving area. Part of the reason for this is the explosion in the number of records that companies must decide to keep or discard. Thirty years ago it was easy - businesses could say, "Keep those 127 file cabinets." Today, all of those official memos and contracts have changed from paper to digital format and reside all over: on PCs, laptops, servers, USB keys and so on. Conversations never recorded years ago now turn up in emails, instant messages, blogs and wikis. This set of messages and documents is intermixed, making it difficult for companies to separate the official from the ephemeral. For example, a company may retain emails for two years, but contracts for seven. This means that if a contract is emailed, the archiving system must store the contract and the email separately so the contract does not vanish when the system destroys the email transaction.
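The emailed-contract example can be sketched in a few lines. This is a minimal illustration, not a real archiving product's behavior: the two-year and seven-year periods come from the example above, while the class, function names and item identifiers are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Retention periods in years, taken from the example in the text.
RETENTION_YEARS = {"email": 2, "contract": 7}


def add_years(d, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


@dataclass
class ArchivedItem:
    item_id: str
    kind: str  # "email" or "contract" (hypothetical content types)
    archived_on: date

    def destroy_on(self):
        """Date on which the retention period for this item ends."""
        return add_years(self.archived_on, RETENTION_YEARS[self.kind])


def archive_email(email_id, attachment_kinds, archived_on):
    """Store an email and each attachment as separately retained items."""
    items = [ArchivedItem(email_id, "email", archived_on)]
    for i, kind in enumerate(attachment_kinds):
        items.append(ArchivedItem(f"{email_id}-att{i}", kind, archived_on))
    return items


def surviving(items, today):
    """Items whose retention period has not yet expired on `today`."""
    return [item for item in items if today < item.destroy_on()]


items = archive_email("msg-1", ["contract"], date(2007, 1, 2))
# Three years later the email is gone, but the contract is still retained.
print([item.item_id for item in surviving(items, date(2010, 1, 2))])
```

Keeping the attachment as its own item with its own retention clock is the design choice that matters here; if the contract lived only inside the email record, it would be destroyed along with it after two years.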
