Editor's Note: This column is Part 1 of a two-part series focusing on the characteristics of enterprise portals. The series concludes in "Real-Time Data Warehouse: Critical Characteristics of Enterprise Portals, Part 2."

An enterprise portal, by itself, provides no intrinsic value to an organization. The value is based entirely on the content presented and the processes supported by the portal. An enterprise portal complements and adds value to your data warehouse, document management, workgroup collaboration and person-to-person communication investments.

Enterprise portals will become a pivotal tool of information architecture, which is the framework for managing the meaning, organization and flow of information throughout an enterprise. An enterprise portal provides enhanced meaning by organizing content in context and by supporting collaborative flows of information among communities of common interest. It helps convert raw data into information and then validates that information to produce knowledge.

Volumes of work have been devoted to defining the terms "data," "information" and "knowledge." For the purpose of this column, I'll offer these definitions:

Data is a set of observable, measurable or calculable attributes. Data is isolated facts or observations. Alone and in the abstract, data does not provide us with information.

Information is data in context. It is raw material plus conceptual commitments and interpretations. To create information, data must be extracted, filtered or formatted for presentation in some specific manner. Correlated sets of data produce greater meaning.

Knowledge is a subset of information that has been subjected to and has passed validation tests. Tests of validity produce information that allows you to act. The essence of knowledge is that it is conclusive.
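
One way to make these three definitions concrete is the minimal Python sketch below. The class names, the promote helper and the sample validity test are my own illustrative assumptions; they are not drawn from any portal product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Datum:
    """An isolated fact or observation; it carries no meaning on its own."""
    value: object


@dataclass(frozen=True)
class Information:
    """Data in context: the datum plus the interpretation placed on it."""
    datum: Datum
    context: str


@dataclass(frozen=True)
class Knowledge:
    """Information that has passed a validation test."""
    information: Information


def promote(info: Information,
            is_valid: Callable[[Information], bool]) -> Optional[Knowledge]:
    """Return Knowledge only if the (hypothetical) validity test passes."""
    return Knowledge(info) if is_valid(info) else None


# Hypothetical example: a number becomes information once we say what it measures.
reading = Datum(1250)
info = Information(reading, context="units sold, Region East, week 6")
known = promote(info, lambda i: i.datum.value >= 0)
rejected = promote(Information(Datum(-5), "units sold"), lambda i: i.datum.value >= 0)
```

Here the validity test is deliberately trivial (a unit count cannot be negative). In practice the test is whatever check an organization trusts enough to act on.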

Data is often differentiated into structured and unstructured types. Structured data is organized in a formal way (tables, columns, files and fields) to produce a database. Unstructured data is everything else including text, graphics, sound, images and combinations as found on the Internet.

This distinction has always irritated me because it only serves to segregate domains that have historically been managed in radically different ways. The disciplines, and the disciples, have diverged to such an extent that they have no common language or tools. People who deal with structured data are "data people" while people who deal with unstructured data are "information people" or "knowledge people." Bah, humbug!

This artificial perspective has carried over into the realm of enterprise portals. The vendors fight about the very term to preserve the distinction. "Enterprise information portals" come from data vendors, and "corporate portals" or "knowledge portals" come from the other guys. Their products do share one thing in common - they are incomplete. The most essential characteristic of a true enterprise portal solution is that it manages all forms of content in a more uniform and integrated manner.

Instead, I see a continuum of content that is differentiated in two more meaningful ways. The first is the definition of information as data in context and knowledge as validated information. The second is the distinction between raw and refined material. Raw material is an input and refined material is an output of an analytic or cognitive process. It is okay to think of data as raw material and information as refined material, but this is too simplistic.

Anything can be raw material if it is collected by someone to initiate a decision-making or discovery process. The process of viewing, using, investigating, manipulating and/or reaching a conclusion produces refined material. This is sometimes referred to as refined results to drive home the output connotation. What is raw and what is refined is in the eye of the beholder and depends on how the content is used.

The most basic role of an enterprise portal is to be a delivery vehicle for diverse content. A more powerful role is to help manage the flow of the information supply chain [1] of increasingly refined results. The ultimate role is for the enterprise portal to allow the organization to provide context to generate and share information as well as correlation processes to produce validated knowledge.

I will define nine critical characteristics, or sets of features, of a complete enterprise portal solution. These characteristics push the definition of an enterprise portal way beyond the "table of contents with a search engine" functionality of Internet portals such as Yahoo!.

Content - Stuff that is Organized for Access and Assimilation

Content includes every type of data and information source that is used within an organization. This must encompass the whole gamut of data types such as organized databases, text (from plain text to documents to Web pages), all AV sources (sound, video and animations) and links to content stored elsewhere. Messages in the form of e-mail, notes, forums, news and chats are a natural component of an Internet-based environment.

The portal must provide accessors or viewers that, at a minimum, display the content and ideally allow editing or manipulation of the material. This topic is covered later in both the connectivity and channel categories.

In addition, content includes organized methods to present and deliver sets of material as a unit. These are critical forms of refined results that allow the publisher to group content that is interrelated or makes a specific case. These include reports, findings and packages - all of which are terms that are given a unique meaning here. Reports are formatted presentations of material in the form of a document that may embed text, charts, diagrams and tables. Findings are the results of analysis or studies that incorporate the indicative data points, the conclusion and the rationale behind the results. Packages are sets of dissimilar data and information objects that are defined by the publisher as a single unit for delivery.
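
The three forms of refined results can be sketched as simple record types. This is only an illustration under my own assumptions: the field names and the manifest method are hypothetical, chosen to mirror the definitions above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Report:
    """Formatted document that may embed text, charts, diagrams and tables."""
    title: str
    sections: List[str] = field(default_factory=list)


@dataclass
class Finding:
    """Result of an analysis: the data points, the conclusion and the rationale."""
    data_points: List[float]
    conclusion: str
    rationale: str


@dataclass
class Package:
    """Dissimilar objects that the publisher defines as one unit for delivery."""
    publisher: str
    items: List[Union[Report, Finding, str]] = field(default_factory=list)

    def manifest(self) -> List[str]:
        """List the type of each item, in delivery order."""
        return [type(item).__name__ for item in self.items]


# Hypothetical package: a report, a finding and a plain-text note travel together.
package = Package(
    publisher="analyst",
    items=[
        Report("Q4 summary", ["overview", "regional tables"]),
        Finding([1.0, 1.1, 0.9], "sales are flat", "no trend across three periods"),
        "note: raw extract available on request",
    ],
)
kinds = package.manifest()
```

The point of the Package type is that its items need not share a format; what unites them is the publisher's decision to deliver them as one unit.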

Context - That Which Surrounds and Gives Meaning to the Content

Context is the essence of information. If you are given an isolated fact, it has no meaning. When you are told that the answer is 42, what do you know? Precisely nothing. However, when you learn that 42 is the "answer to the secret of life, the universe and everything," you now have information [2]. Not terribly useful information, but information nonetheless.

A portal defines context by embedding, linking or packaging related content together and by ubiquitous meta data and reference data. Embedding places content directly on a Web page. Linking is one of the defining features of the hypertext-based Internet and provides context by connecting to associated material. This can help clarify a point, explain a result or support a position.

One of the most powerful features of a portal context setting is immediate and intuitive accessibility of meta data. Definitions, descriptions and examples can be one click away. Likewise, reference data can be made available to explore related topics in depth.

Context is also revealed by learning what the popular topics are (top 10 hits, for example) and discovering who else has read or viewed the same content. The fact that Ralph Kimball's book, Data Warehouse Toolkit, is the number one data warehouse bestseller tells you more than an isolated review or testimonial. Learning that many other database analysts have read a particular article is likely to convince you to read it, too. Without this contextual information, you may have a harder time deciding where to begin your personal research on data warehousing.
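
As a rough sketch of how a portal might derive this kind of usage context, the following Python keeps a simple view log. The ViewLog class, its method names and the sample readers and items are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


class ViewLog:
    """Tracks which (hypothetical) readers viewed which content items."""

    def __init__(self) -> None:
        self._hits: Counter = Counter()
        self._readers: Dict[str, Set[str]] = {}

    def record(self, reader: str, item: str) -> None:
        """Count one view of `item` and remember who looked at it."""
        self._hits[item] += 1
        self._readers.setdefault(item, set()).add(reader)

    def top(self, n: int = 10) -> List[str]:
        """Most-viewed items first, as in a 'top 10 hits' list."""
        return [item for item, _ in self._hits.most_common(n)]

    def also_viewed_by(self, item: str, reader: str) -> List[str]:
        """Other readers of the same item, excluding `reader`."""
        return sorted(self._readers.get(item, set()) - {reader})


log = ViewLog()
for reader, item in [("ann", "toolkit"), ("raj", "toolkit"), ("lee", "toolkit"),
                     ("ann", "olap-primer"), ("raj", "olap-primer"),
                     ("lee", "etl-notes")]:
    log.record(reader, item)

popular = log.top(2)
peers = log.also_viewed_by("toolkit", "ann")
```

A real portal would likely weight recency and respect privacy settings before revealing who read what; this sketch ignores both.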

Connectivity - How the Portal Gains Access to the Content

A portal cannot serve the needs of information consumers if it can't access specific content they are interested in. Built-in capabilities, wrappers, application programming interfaces or external functionality can all provide connectivity. A superior portal has an extensible architecture allowing for a diverse blend of connection methods.

A broad range of built-in capabilities provides more immediate capability out of the box. The native ability to open Microsoft documents or read Lotus Notes databases is preferred over being required to have third-party software installed.

A full-featured portal will support all industry-standard and market-dominant application programming interfaces. APIs exist at the storage level, the application level and the business level. Storage-level APIs include, at least, all of the following: SQL, native DBMS interfaces, the newly evolving OLAP call-level interfaces, text databases and document management system APIs, as well as access to common messaging (e-mail, notes, forums, etc.) environments.

Application-level interfaces include access to business intelligence servers, ERP systems, other business applications, knowledge management products and other forms of application servers. Business-level APIs are a more obscure topic. They include invocation of business rule management systems, business-to-business interaction via EDI protocols and industry-specific implementations of XML to enhance semantic standardization of a business process.
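
An extensible connectivity architecture can be sketched as a registry of pluggable connectors. The code below is a minimal illustration under stated assumptions: ConnectorRegistry, sql_connector and mail_connector are hypothetical names, an in-memory SQLite database stands in for a real DBMS, and the mail connector is a stub.

```python
import sqlite3
from typing import Callable, Dict, List

# A connector takes a locator (query, folder, id) and returns rows of content.
Connector = Callable[[str], List[tuple]]


class ConnectorRegistry:
    """Maps a content kind to the connector that knows how to reach it."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Connector] = {}

    def register(self, kind: str, connector: Connector) -> None:
        """Add or replace the connector for one kind of source."""
        self._connectors[kind] = connector

    def fetch(self, kind: str, locator: str) -> List[tuple]:
        """Route the request to the registered connector, if any."""
        if kind not in self._connectors:
            raise LookupError(f"no connector registered for {kind!r}")
        return self._connectors[kind](locator)


def sql_connector(query: str) -> List[tuple]:
    """Storage-level access via SQL; in-memory SQLite stands in for a real DBMS."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()


def mail_connector(folder: str) -> List[tuple]:
    """Stub for a messaging source; a real one would call the mail system's API."""
    return [(folder, "no messages")]


registry = ConnectorRegistry()
registry.register("sql", sql_connector)
registry.register("mail", mail_connector)
rows = registry.fetch("sql", "SELECT 40 + 2")
```

New sources are added by registering another connector rather than by changing the portal itself, which is the sense in which the architecture is extensible.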

Channels - How People Interact with the Portal

Channels are the means made available to information consumers to send and receive content or to interact with the portal. Our definition is based on a communications metaphor for how information is received or transmitted [3].

Receiving channels are methods that allow you to scan, fetch, order or subscribe to content. Scanning simply means to browse a set of pages on the portal looking for items of interest. Fetching requires you to go after something intentionally and includes following links and requesting downloads. Ordering is a broad concept that covers all forms of requesting specific content interactively. This includes everything from following a scripted process to guide you to a result to building an ad hoc query from scratch. All query and reporting business intelligence tools are included in the ordering feature set. Subscribing is the act of signing up for periodically released material from a specific source or on a defined topic. The result may be to have material sent directly to you or you may simply be informed when new stuff is available.

Sending channels refer to all the ways that content is transmitted to information consumers. The sending channels include such methods as deliver, publish, share and refer. Delivering means designating specific content to be sent to a particular individual or group. Publishing is a method of making content available by subscription. The publisher may not know who will receive the material, but the publisher does define the rules for subscription. Action is taken to send the material, or a notice of availability, to the subscribers. Sharing is a method of designating material as being available to a defined population. No further action is taken by the one sharing information. Anyone who passes the qualification criteria can access the material. Access may or may not be tracked. Referring is an explicit action taken by one party to alert another to interesting or relevant content. The "refer this article" button on some Web sites is an example of this feature. Methods of referral include sending e-mails with embedded links or posting a link on the individual's private home page on the portal.
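
The difference between the four sending methods is easiest to see side by side. The sketch below is a simplified model built on my own assumptions: the Portal and Consumer classes and their method names are hypothetical, an inbox list stands in for actual transmission, and subscription rules and access tracking are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Consumer:
    name: str
    department: str
    inbox: List[str] = field(default_factory=list)


class Portal:
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Consumer]] = {}
        self._shared: Dict[str, Callable[[Consumer], bool]] = {}

    def subscribe(self, topic: str, consumer: Consumer) -> None:
        """Receiving channel: sign up for material on a topic."""
        self._subscribers.setdefault(topic, []).append(consumer)

    def deliver(self, content: str, recipients: List[Consumer]) -> None:
        """Send specific content to named individuals."""
        for person in recipients:
            person.inbox.append(content)

    def publish(self, topic: str, content: str) -> None:
        """Send content to whoever subscribed; the publisher need not know them."""
        self.deliver(content, self._subscribers.get(topic, []))

    def share(self, content: str, qualifies: Callable[[Consumer], bool]) -> None:
        """Mark content as available to a population; nothing is sent."""
        self._shared[content] = qualifies

    def can_access(self, content: str, consumer: Consumer) -> bool:
        """Shared content is open to anyone who passes the criteria."""
        rule = self._shared.get(content)
        return rule is not None and rule(consumer)

    def refer(self, content: str, sender: Consumer, recipient: Consumer) -> None:
        """Alert another party to interesting content."""
        recipient.inbox.append(f"{sender.name} recommends: {content}")


portal = Portal()
ann = Consumer("Ann", "finance")
raj = Consumer("Raj", "marketing")
portal.subscribe("sales", ann)
portal.publish("sales", "Weekly sales report")
portal.share("Budget model", lambda c: c.department == "finance")
portal.refer("Budget model", ann, raj)
```

Note that share pushes nothing to anyone: Raj learns of the budget model only through Ann's referral, and he still cannot open it because he does not meet the sharing criteria.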

In Part 2 of this column, I will describe the remaining critical characteristics of a complete enterprise portal solution which include collaboration, correlation, communities, customization and control.

Part 2 of "Real-Time Data Warehouse: Critical Characteristics of Enterprise Portals" will be published in the February 11 issue of DM Direct, DMReview.com's electronic newsletter.

1. See the column, "Real Time Data Warehouse: Data Warehousing, Enterprise Portals and the Information Supply Chain" by Michael Haisten, published on November 12 on DMReview.com.

2. I apologize for the weak reference to the cult favorite, "The Hitchhiker's Guide to the Galaxy." This observation serves to define information as data in context, though the information itself is inscrutable.

3. This use of the term "channel" is in conflict with the more typical definition that is evolving in the portal space. Many vendors use "channel" to mean what we call a category, which is just a grouping of content. Often the term "channel" takes on a "community of interest" connotation. People with shared interests either subscribe to or access a channel to see relevant content. The industry usage is ambiguous and conflicting. We choose to use the more precise terms described here.
