A data warehouse presents, in data, a picture of an enterprise such as a corporation, university or government agency. The picture presented is only as current as the data. If the data is stale, then the data warehouse is out of date. If, however, the data is fresh, then the data warehouse can be current and up to date. This, of course, is mere common sense. Anyone in the data warehousing field knows that data has a shelf life, a figurative expiration date, after which data and the data warehouse surrounding it begin to go stale.


What is that expiration date? How soon does data go stale? These questions and the responses to them become increasingly relevant as data and data warehousing grow and become competitive weapons. These questions, taken to their ultimate conclusion, lead to real-time data warehousing. Advances in technology are making real-time data warehousing a more viable option and are causing organizations to consider the possibility more closely.


This three-part series addresses four key questions, which must be answered as the decision-makers in an organization consider the possibility of real-time data warehousing:

  • What is real time?
  • When and where is real time valuable? Real-time data warehousing is expensive; when is it worth the investment?
  • How is real time achieved?
  • Troubleshooting in real time. Now that you have a real-time data warehouse, what causes the biggest headaches, and what should you do about them?

What is Real Time?


Interactive Applications


Real-time applications have been around for years. Customer Information Control System (CICS) applications give a mainframe the ability to respond interactively to the data and actions of a user. More recently, Web-based applications give a server the ability to do the same. As such, interactive functionality has been the alternative to batch processing, as well as the de facto definition of real time. An application, therefore, could run either as a batch job or as a set of interactive responses.


This perception associates interactive applications with real-time applications. A data warehouse is not interactive in the way a Web-based online ordering application is interactive. A data warehouse can, however, receive and integrate data at the same pace as a Web-based online ordering application. This leads to the first concept of real-time data warehousing.


Data at the Speed of Technology


In any discussion of real-time data, the first topic is the distinction between real time and near real time. This is an academic splitting of hairs over the fact that no architecture (including copper wire, optical fibers and brain synapses) can truly register an event in place B the exact instant it occurs in place A. So, having recognized the discussion of real time versus near real time to be a semantic exercise, let us move on to the topic of data at the speed of technology.


Data at the speed of technology is the concept of real time in which an event in place A is registered in place B at the earliest possible moment technology will allow. As technology advances, the time gap between the event in place A and its registration in place B shortens.


Data at the speed of technology is both difficult and expensive. The investment in real-time architectures and infrastructures causes some organizations to avoid real time. In such cases, the cost of technology and application development limits the available options. This is unfortunate, because real time does not need to be instantaneous.


Data at the Speed of Business


An alternative to data at the speed of technology is data at the speed of business. To choose to deliver data to the business instantaneously because available technology can do so is to allow technology to drive the business. While this method may seem fun and have a high cool factor, the fun and the cool factor fade once the business realizes the cost of a technology it doesn’t really want or need. That’s when the business begins to drive the technology: data moves at the speed of business, rather than the speed of technology.


A business can only absorb data so quickly. For example, many people use an online trading service for stocks and bonds. Online stock and bond trading services include real-time pricing for selected stocks and bonds (i.e., a ticker tape of the current price). If an online trading service displayed the price of each and every individual transaction as a separate price point for a frequently traded stock, the flood of price points would overwhelm a person. Instead, an online stock trading service provides updates at a frequency that a business person can absorb.


With data at the speed of business, data is delivered to the business at the speed at which the business can absorb it, integrating new data points with other data points to derive information, knowledge and possibly wisdom. Is this data instantaneous? No. But when the business cannot absorb a continuous flow of data at the speed of technology, such a continuous data flow becomes noise rather than data. It’s similar to walking into a fourth-grade classroom and asking the children to describe their favorite car. The flood of information would quickly become noise, informational but overwhelming.
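One way to picture the difference is a simple throttle that turns a stream of individual trades into updates a person can follow. The sketch below is only an illustration, not a description of how any trading service works: the function name `throttle_ticks`, the one-second interval and the sample prices are all hypothetical, and the ticks are assumed to arrive in time order.

```python
from typing import Dict, Iterable, List, Tuple

# A tick is (timestamp in seconds, price). Hypothetical structure for illustration.
Tick = Tuple[float, float]


def throttle_ticks(ticks: Iterable[Tick], interval: float) -> List[Tick]:
    """Keep only the last tick seen in each window of `interval` seconds.

    Assumes the ticks arrive in time order, so a later tick in the same
    window simply overwrites the earlier one.
    """
    latest: Dict[int, Tick] = {}
    for ts, price in ticks:
        window = int(ts // interval)  # index of the window this tick falls in
        latest[window] = (ts, price)
    return [latest[w] for w in sorted(latest)]


# Six trades arrive within three seconds (made-up sample data).
ticks = [(0.1, 10.00), (0.4, 10.02), (0.9, 10.01),
         (1.2, 10.05), (1.8, 10.03), (2.5, 10.04)]

# Data at the speed of business: one update per second instead of six price points.
updates = throttle_ticks(ticks, interval=1.0)
print(updates)
```

The interval is the business decision here. Shortening it moves the feed toward the speed of technology; lengthening it moves the feed toward what the audience can actually absorb.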


Having defined the two forms of real time, it is important to answer the question, “When and where is real time valuable?”



When and Where is Real Time Valuable?


The decision to integrate real-time data into a data warehouse is a business decision; it’s not a technology decision and not an information system decision. Whether real time is worth the investment is a question of the ROI of the real-time data. So, we must first turn to the business unit that would use the real-time data and ask what it plans to do with that data. For example, consider Figure 1, which presents a series of data points in a graph.



Next, consider Figure 2, which presents the same series of data points, with the next data point (68,280) in the series.



What business decision would be influenced by the inclusion of this new data point (68,280), or the next data point (68,801)? These figures illustrate the distinction between strategic business decisions and tactical business decisions. Strategic business decisions have long horizons (e.g., days, years and possibly decades). A person making a strategic business decision would look at the long history of data points. A single new data point does not change the long history of data points, and therefore does not change a strategic business decision. Strategic decision-makers, typically middle and upper management, do not benefit from real-time data. If the audience of real-time data is middle or upper management, then the lack of ROI indicates that real-time data is not necessary.


Tactical business decisions, however, have short and immediate horizons (e.g., minutes, hours or possibly a day). A person making a tactical business decision looks only at data that is directly relevant to business operations occurring right here and now. A single new data point may mean that something has occurred, or not occurred, which is of immediate concern to the operations of the enterprise. Tactical decision-makers, typically frontline workers and managers, can benefit from real-time data. If the audience of real-time data is frontline workers or managers, then the possibility of ROI indicates that real-time data may benefit the enterprise.
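The contrast between the two horizons can be put in numbers. The following sketch uses made-up values, not the data behind Figures 1 and 2, and the names `relative_shift` and `breaches_threshold` are hypothetical. It assumes the strategic view is a long-run average and the tactical view is a simple alert threshold, which is only one of many ways either audience might read the data.

```python
from statistics import mean
from typing import List


def relative_shift(history: List[float], new_point: float) -> float:
    """Fractional change in the long-run mean caused by one new data point."""
    before = mean(history)
    after = mean(history + [new_point])
    return abs(after - before) / before


def breaches_threshold(new_point: float, threshold: float) -> bool:
    """Tactical check: does this single point demand attention right now?"""
    return new_point > threshold


# Hypothetical series: 99 steady readings, then one unusual reading.
history = [100.0] * 99
new_point = 200.0

# Strategic view: the long-run mean moves from 100 to 101, a shift of about 1%.
shift = relative_shift(history, new_point)

# Tactical view: the same point crosses an assumed operating limit of 150.
alert = breaches_threshold(new_point, threshold=150.0)

print(f"strategic shift: {shift:.2%}, tactical alert: {alert}")
```

Under these assumptions the same data point is nearly invisible to the person studying the long history and urgent to the person watching operations, which is where the ROI of delivering it quickly would have to come from.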


Real-time data, therefore, achieves its best ROI in the context of tactical business decisions by helping those on the front line who make those decisions. This conclusion, of course, does not prevent someone in the echelons of middle or upper management from deciding that the cool factor is both the justification and the ROI for real-time data; in that context, the logic and reasoning in this article are rendered irrelevant. This conclusion does, however, provide a business-oriented framework within which to propose that an enterprise should or should not include real-time data. Such a business-oriented framework increases the probability that a proposal regarding real-time data will be understood and appreciated by the powers that be in an enterprise.


Having defined the two forms of real time and the scenario wherein real time achieves the highest ROI, the next article in this series will answer the question, “How is real time achieved?”
