This is one article of a series appearing in this issue categorizing the users of the corporate information factory.

Perhaps the most misunderstood member of the DSS/data warehouse community is the "explorer." The explorer is the original corporate "out-of-the-box" thinker: an individual who does not look at the life and commerce of the corporation in the standard ways. Instead, the explorer looks at corporate business differently than anyone else. In some cases, the resulting insights are very valuable; in other cases, they are merely a mirage.

Random Queries

The queries submitted and the analysis required by the explorer are highly random. The explorer operates on intuition and observation, trying to find relationships between obscure pieces of data and events. The explorer is often wrong in the conclusions that he or she draws. But on occasion the explorer is correct. On those occasions, the payback to the corporation can be tremendous, easily covering the cost of the explorer's many misses.

Often the explorer finds nothing as a result of the analysis done. Occasionally the explorer finds huge nuggets overlooked by everyone preceding the explorer.

Uninstitutionalized Procedures

The analysis and exploration procedures used by the explorer are not institutionalized by the corporation. The explorer operates in an unstructured world. The explorer may not submit a query for six months and then submit six queries in a day, when the fancy strikes him or her. The explorer operates in a truly heuristic manner.

The queries submitted by the explorer tend to be large, for three reasons. First, the explorer operates on detail and looks at minute pieces of data to find the desired subtle patterns. Second, the explorer requires history. The patterns sought occur infrequently, so the explorer needs robust amounts of historical information. Third, the explorer needs to look at the data being analyzed in a manner unknown to other users. The data used by the explorer needs to be twisted around to suit the mood of the explorer. In short, the queries submitted by the explorer are huge because the characteristics of the query are:

history x detail x ten-way joins

Because the explorer needs all three of these characteristics at once, very large queries are unavoidable.
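As a rough illustration of why these factors multiply rather than add, the sketch below compares a routine summary query with an explorer-style query. The function name, the cost model (one full pass over the detailed history per joined table) and every figure are assumptions chosen for illustration; none come from a measured workload.

```python
def estimated_rows_touched(years_of_history: int,
                           detail_rows_per_year: int,
                           tables_joined: int) -> int:
    """Crude estimate of query size: history x detail x joins.

    Assumes (hypothetically) that every joined table costs one pass
    over the full span of detailed history.
    """
    return years_of_history * detail_rows_per_year * tables_joined


# A routine report: one year of monthly summary rows, two tables.
summary_query = estimated_rows_touched(1, 12, 2)

# An explorer query: ten years of transaction-level detail, ten-way join.
explorer_query = estimated_rows_touched(10, 50_000_000, 10)

print(f"summary query : {summary_query:,} rows")
print(f"explorer query: {explorer_query:,} rows")
```

Under these assumed figures, raising any one factor scales the whole product, which is why a query that needs history, detail and many joins together grows so quickly.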

Looking for Patterns and Relationships

The explorer looks for patterns and relationships. The explorer cares about the conditions that cause the occurrence of a notable event. Once the conditions surrounding a notable event are established, the explorer can seek predictability. Once predictability is determined, it is relatively easy to then create an environment where there is business advantage.

The explorer creates hypotheses out of his or her analysis. The explorer then passes each hypothesis to the data miner, who proves or disproves it and analyzes its strength. Often the explorer will turn a finding into a repeating query and pass that query on to the farmer to run routinely.

Comparing the Explorer

The explorer is different from the data miner in that the role of the explorer is to create or suggest hypotheses while the role of the data miner is to prove or disprove those hypotheses. The explorer is different from the farmer in that the farmer operates on a basis of regularity and known queries. The explorer operates in the world of the unknown. The tourist is different from the explorer in that the tourist operates on massive breadth of data but very little depth. The explorer operates on a combination of breadth and depth.

Architectural Considerations

The explorer often needs a world of his or her own in which to submit these varying queries. Today, there are several technologies available to support the explorer. We can now create an exploration warehouse just for the explorer's use.[1] The exploration warehouse takes advantage of specialized databases, such as token-based or memory-resident technologies, to create an environment that permits any and all queries. The response time is reasonable for such large queries. The explorer can change his or her mind as often as needed and not be penalized.

The exploration warehouse is a component of the corporate information factory. It consists of data drawn from the data warehouse that is reformatted into either a token-based database or a memory-resident database. Then the explorer uses a variety of tools that access these technologies to launch queries, receive results, study the results and then launch another query.
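The article does not describe how a token-based database works internally. One common reading, assumed here, is that each distinct value is stored once and rows hold small integer tokens that point to it, with an index from token to row positions. The class and method names below are hypothetical and do not reflect any particular product.

```python
from collections import defaultdict


class TokenizedColumn:
    """Illustrative token-based column: each distinct value is stored once."""

    def __init__(self):
        self.token_of = {}                  # value -> token
        self.values = []                    # token -> value
        self.rows = []                      # row position -> token
        self.rows_with = defaultdict(list)  # token -> row positions

    def append(self, value):
        token = self.token_of.get(value)
        if token is None:
            # First time this value is seen: assign the next token.
            token = len(self.values)
            self.token_of[value] = token
            self.values.append(value)
        self.rows_with[token].append(len(self.rows))
        self.rows.append(token)

    def find(self, value):
        """Return the row positions holding value, without scanning all rows."""
        token = self.token_of.get(value)
        return [] if token is None else list(self.rows_with[token])


region = TokenizedColumn()
for v in ["east", "west", "east", "north", "east"]:
    region.append(v)

matches = region.find("east")
print(matches)
```

Because a lookup goes from value to token to row positions, an unplanned predicate does not require an index designed in advance, which is one plausible reason such structures suit exploration.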

The explorer has other needs, though, that may be satisfied only by using the full data warehouse and possibly near-line stored data. Therefore, his or her access to the data warehouse and archived data becomes an important consideration.

The database design explorers find most useful is a normalized one. Because their queries are "ad hoc" in nature and very unpredictable, this type of database design seems to work best for them. Rarely will explorers use a star schema or other predefined database design.
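A minimal sketch of an ad hoc query against a normalized design follows, using Python's built-in sqlite3 module with an in-memory database. The table names, column names and sample rows are invented for illustration; the only point is that the explorer composes the join at query time rather than relying on a predefined path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE sale (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    product_id  INTEGER REFERENCES product(product_id),
    sale_date   TEXT,
    amount      REAL
);
INSERT INTO customer VALUES (1, 'east'), (2, 'west');
INSERT INTO product  VALUES (10, 'tools'), (11, 'toys');
INSERT INTO sale VALUES
    (100, 1, 10, '1998-01-15', 40.0),
    (101, 1, 11, '1998-02-03', 15.0),
    (102, 2, 10, '1998-02-20', 60.0);
""")

# An ad hoc question decided at query time: revenue by region and category.
rows = conn.execute("""
    SELECT c.region, p.category, SUM(s.amount)
    FROM sale AS s
    JOIN customer AS c ON c.customer_id = s.customer_id
    JOIN product  AS p ON p.product_id  = s.product_id
    GROUP BY c.region, p.category
    ORDER BY c.region, p.category
""").fetchall()

for row in rows:
    print(row)
```

The next question might join the same tables along a different path or group by different columns, and the normalized tables do not need to be redesigned to answer it.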

Summary

The explorer is becoming a significant user of the corporate information factory. Explorers' queries tend to be long and random, and they may turn up nothing of interest. Occasionally, however, the explorer obtains results that are of incredible value to the corporation. With the new technologies, it has become significantly easier to satisfy these unusual and demanding business users.

[1] See W. H. Inmon, "The Exploration Warehouse," DM Review, June 1998.
