Remember when you used to get canceled checks in a thick envelope from the bank along with your monthly statement? That doesn't happen much anymore. Today, many banks scan those canceled checks and make them available for online viewing. Additionally, the scanned images are considered by federal banking laws to be "original documents," meaning they are "original forms with signatures" and can be used for legal purposes such as transaction and/or payment disputes with merchants.

I'm not trying to give you a banking lesson; my purpose is to provide a picture of what is happening with IT on a global basis. During the last five years, the percentage of unstructured content (such as the scanned images of the canceled checks) in enterprise information systems has been rising like a rocket.

In fact, in many companies unstructured content is growing faster than traditional data-based content. Examples of this unstructured enterprise content (EC) include Web content, scanned document images, audio and video files, broadcasts and diagrams - just to name a few. It's unstructured in that it doesn't reside in rows, columns and fields somewhere in a corporate database.

What's driving this increase in EC across the company? It's the age-old human requirement to know more. Many products and services now come bundled with content (such as scanned documents). For companies, that content is seen as a key factor in helping to enhance business processes.

The increasing importance of electronic channels is also a key to the rise in EC. With increased governmental and regulatory scrutiny, there are mandates to keep records of Internet and email transactions. Demand for richer media such as audio files and HD video for training, marketing, etc. also creates EC.

Thus, I believe there's a critical need for companies to create so-called "semantic networks" that help enable users to search EC to gain access to needed information. These semantic networks are created over time and can contain a library of terms that can be expanded as the network grows, based on EC that is created, collected and stored by the company. They can offer users the ability to perform phonetic and contextual searches.
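To make the idea of a phonetic search over a growing term library concrete, here is a minimal sketch in Python. It assumes the classic Soundex algorithm as the phonetic matcher, which is my choice for illustration and not something any particular ECM product is known to use. The `TermLibrary` class and its method names are hypothetical.

```python
from collections import defaultdict

# Standard Soundex consonant groups; vowels, h, w and y carry no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code for a word ('' if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated code after it counts
    return (encoded + "000")[:4]


class TermLibrary:
    """Hypothetical term library that can be expanded as new content arrives."""

    def __init__(self):
        self._by_sound = defaultdict(set)

    def add(self, term):
        self._by_sound[soundex(term)].add(term)

    def phonetic_search(self, query):
        """Return stored terms that sound like the query, sorted alphabetically."""
        return sorted(self._by_sound.get(soundex(query), set()))


library = TermLibrary()
for term in ("check", "cheque", "statement", "invoice"):
    library.add(term)

matches = library.phonetic_search("chek")  # misspelled query
```

In this sketch a misspelled query such as "chek" still finds both "check" and "cheque" because all three share one Soundex code. A contextual search would need more than this, for example links between related terms, which is left out here.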

To derive full use from semantic networks, however, it is essential to combine the ability to analyze EC with business intelligence (BI) efforts to help facilitate workflow and collaboration and to provide an enhanced view of the company. Let's look at financial analysis as an example. In a typical scenario, an analyst might generate a few reports from the company's BI suite to look at current numbers and, perhaps, provide some forecasting based on trend analysis.

However, with the ability to search and examine the company's EC library, the analyst would also have access to digitized images of past financial statements, emails containing details of events as they transpired and other documents that could help provide a much more comprehensive picture of the current situation - and how the company arrived at that state. This kind of holistic analysis capability can help provide tremendous insight unavailable solely through analysis of structured data.

With the insight that combining EC with BI capabilities can help provide, I believe enterprise content management (ECM) is probably one of the most critical initiatives that many companies will undertake during the next decade. ECM covers the entire EC life cycle, whether dealing with physical records management or electronic documents and content. The phases of that life cycle are illustrated in Figure 1:

  • Create
  • Manage
  • Assemble
  • Deliver

 Figure 1: The EC Life Cycle

Further, if current trends are any indication, the amount of EC resident in corporate repositories will grow at a much faster rate than it is presently, making it even more critical to use and manage it effectively.

How do you manage EC effectively? Begin at the top, at the executive level, and manage all information - including EC - from a strategic point of view.

Treat EC as a corporate asset, just like any other. Because EC can contain vital information that can be used to help gauge and manage business performance, it is critical to treat it like the valuable asset it is. It is crucial that you know how EC facilitates core business strategy within your information management environment. EC needs to be managed strategically so that it is reliable, available and usable enterprise-wide.

Understand the EC life cycle and use it to your advantage. It is absolutely critical to focus on managing EC across its life cycle so that the knowledge it provides can be used to help manage and grow the company. It is also critical to standardize how EC is handled through each phase of its life cycle enterprise-wide.

Practice information life cycle management - not just data management - on an enterprise-wide level. Information life cycle management includes both ECM and traditional data management (DM). Traditionally, EC has been stored in separate repositories from structured content, such as transactional data. This "siloed" approach must change if EC and BI are to be linked to provide a comprehensive view of the company. All corporate information - structured and unstructured - must be managed on a smooth continuum.

Create an enterprise taxonomy and metadata vocabulary. This will help enable the standardization of terms and definitions associated with EC and structured content, and it will give users a common frame of reference and a "single version of the truth" about information - whether that information takes the form of reports generated from a database or unstructured content.
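As a rough illustration of what a shared metadata vocabulary does in practice, the Python sketch below maps free-form tags onto canonical terms, so that a database report and a scanned document end up labeled the same way. The vocabulary entries and the `normalize_tags` function are invented for this example. A real enterprise taxonomy would be far larger and would be maintained by the business, not hard-coded.

```python
# Hypothetical controlled vocabulary: each canonical term lists accepted variants.
VOCABULARY = {
    "invoice": {"invoice", "bill", "inv"},
    "customer": {"customer", "client", "account holder"},
}

# Invert once so that every variant points at its canonical term.
CANONICAL = {
    variant: term
    for term, variants in VOCABULARY.items()
    for variant in variants
}


def normalize_tags(tags):
    """Return (canonical tags, unrecognized tags) for a list of free-form tags."""
    known, unknown = set(), set()
    for tag in tags:
        key = " ".join(tag.lower().split())  # ignore case and stray whitespace
        if key in CANONICAL:
            known.add(CANONICAL[key])
        else:
            unknown.add(key)
    return sorted(known), sorted(unknown)


# Tags as they might arrive from a scanned document and from a BI report.
document_tags, document_unknown = normalize_tags(["Bill", "Client ", "memo"])
report_tags, report_unknown = normalize_tags(["invoice", "CUSTOMER"])
```

Both sources resolve to the same two canonical terms, and the unrecognized tag "memo" is reported separately so that someone can decide whether it belongs in the vocabulary.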

Choose ECM toolsets and methodologies that integrate with your existing BI architecture. Fundamentally, an ECM methodology should involve a staged implementation approach that covers the entire EC life cycle and helps provide short-term benefits, while laying the foundation for a robust service and information management toolset. Additionally, an ECM toolset should have as its number one priority the ability to couple with your existing BI infrastructure to help provide users with a coherent picture of business events and performance. The toolset should also have text mining capabilities, including:

  • The ability to extract text from multiple sources (such as .doc, PDF and .wmf files).
  • The capacity to mine text from documents in multiple languages.
  • Semantic sophistication to help enable the creation of the semantic network previously discussed.
  • Scalability to grow as your documents and document types grow and change.

The next few years are going to provide a bumpy ride for the IT function at many companies. The rise of unstructured content will require new ways of thinking about how information is used and managed enterprise-wide - indeed it will require a new definition of the term "information." The goal is simple: treat all information - unstructured and structured - as if it is one of the most valuable assets your company possesses. Learn all you can about it, manage it properly, and use it to help grow the company.
