This is the third article in a three-part series on downtime. The end of this article points to an "Annual Cost of Downtime Worksheet" that will help executives and system managers transform the information in this three-part series into a final downtime-impact number.
Our series on understanding downtime ends with this final installment. We have dug fairly deeply into this area of managed availability because it is so important to delineate your degree of financial risk relative to system loss. The downtime number on your ledger sheet and your specific business requirements will almost automatically define your availability needs, allowing you to select a solution that mitigates risk, protects information assets and represents a good ROI.
First, we looked at the theoretical side of downtime, bringing into focus both planned and unplanned issues. Then we reviewed the hard, or quantifiable, costs of downtime - the tangible costs. These range from lost or spoiled inventory to late-delivery penalties and liability exposure, but the most salient categories are labor costs for idled workers and specific lost revenue, for which we provided step-by-step calculation formulas.
Now we need to examine the intangible costs of downtime. This area can get a little scary. We're dealing with what we hypothesize, not with what we distinctly know. And, yet, we can't ignore intangible costs because they may significantly outweigh the tangible impact. Intangible costs can usually be estimated from various sources. The key question to ask: Have we analyzed where these costs may come from and are they reasonable based on the facts or have we underestimated? Most companies severely underestimate intangible costs or simply exclude them.
A list of intangible downtime costs could almost be endless, subject to the characteristics of individual companies. Some principal intangible costs include:
- Lost opportunity,
- Customer loyalty,
- Damaged reputation,
- Employee morale.
Other costs might include credit ratings, analyst reaction to share value, negative publicity and competitive disadvantage in the market.
Lost opportunity may be the biggest cost associated with downtime. The permanent loss of just one loyal customer, together with the recurring profit from that customer's potential ongoing purchases, can be considerable. Some analysts estimate that every one dollar of immediate lost revenue (tangible) translates into 10 dollars of future lost revenue (intangible). When the permanent loss of one customer is multiplied by a factor of 100 or conceivably 1,000, the cost impact may be simply staggering.
As noted in point 3 of our lost revenue formula for tangible costs ("Determine downtime impact factor"; see part 2 of series), lost opportunity (defecting customers) can be estimated as a percentage that increases the impact-factor percentage or it can be dealt with separately as an intangible element, as we have done in our "Annual Cost of Downtime Worksheet." (See end of article for link to worksheet.)
Quantifying lost opportunity is difficult. But it is important to put a stake in the ground, whichever way you choose to deal with lost opportunity. Remember, nothing prevents you from modifying your work as your knowledge of costs and the estimating process grows.
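As a rough illustration of putting that stake in the ground, the one-to-ten rule of thumb cited above can be sketched in a few lines of Python. The function name, the default multiplier and the sample revenue figure are assumptions made for illustration, not values taken from the worksheet; the multiplier in particular should be replaced with your own estimate as it matures.

```python
def estimate_lost_opportunity(immediate_lost_revenue, multiplier=10.0):
    """Estimate future (intangible) lost revenue from immediate (tangible) lost revenue.

    The default multiplier of 10 mirrors the analyst rule of thumb quoted in
    the article. It is an assumption to be tuned, not a measured constant.
    """
    if immediate_lost_revenue < 0 or multiplier < 0:
        raise ValueError("inputs must be non-negative")
    return immediate_lost_revenue * multiplier


# Hypothetical example: an outage with 25,000 dollars of immediate lost revenue.
tangible = 25_000
intangible = estimate_lost_opportunity(tangible)
print(f"Immediate lost revenue:        ${tangible:,.0f}")
print(f"Estimated future lost revenue: ${intangible:,.0f}")
```

Keeping the multiplier as a parameter makes it easy to rerun the estimate with a more conservative or more aggressive assumption.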
Companies spend millions attracting customers and making a sale. And they spend even more to keep those customers and continue marketing to them. Witness the rise of CRM, data warehouses and Web-based preference-tracking software. Obviously, there is a strong relationship between "customer loyalty" and "lost opportunity." This relationship is often bridged by "customer service," which has evolved rather dramatically as customers increasingly interface with applications and less with humans.
In the past, a friendly voice could offset the impatience and possible anger a customer might experience on the phone when a system outage blocked a transaction, perhaps persuading the customer to call back once the system was back online. But now, in the absence of human contact, a potential customer can move to a competitor's offerings with a single mouseclick.
Recent studies have shown that the average time between mouseclicks of an Internet user (and shopper) is less than five seconds. Customer loyalty is fluid. Competitors are only a mouseclick away. The impact of not "being there" could be significant, making the degree to which your company offers applications interface versus human interface a serious downtime cost consideration.
We have all experienced the individual who continually criticizes a business for some small, possibly unintentional, act even years after the original incident. It is not hard to imagine the ill will a longer-term system outage might bring about, affecting sales and distribution, partners, suppliers and customers. The impact could be huge and take years to recover from.
Of particular importance, too, is a tarnished image with investors and the way a downtime event, particularly if it is reported in the media, can affect a company's stock price. Several years ago, when Amazon.com was offline for a number of hours because of a server failure, company stock fell 25 percent in the next day's trading!
What is your reputation worth? It may be difficult to assess the long-term effect of a damaged reputation and its impact on revenue and profitability without investing in surveys and research that mostly paint an attitudinal picture. Even so, stock downturns are quite tangible, as are the marketing man hours and media dollars required to reestablish and polish an organization's profile. In the absence of surveys and research, absolute revenue levels over the 24 months following a downtime event as compared with the revenue levels before the event will serve as a fairly good indicator of the company's success in rectifying its image.
Employees generally want to do a good job and, if the tools necessary to do a good job are not available or are unreliable, employees may begin to think that management doesn't care. Frustration may lead to careless behavior that could cut deeply into productivity. What's more, such behavior has a way of spreading throughout an organization with sometimes alarming speed.
Disgruntled employees - especially the good ones - may leave the organization. As we have seen, lost production and productivity can be quantified. But losing good employees is an expensive proposition, too, and it is compounded by the costs to hire, train and gain equivalent productivity from replacement employees - often estimated at more than a year's salary for each replacement employee! Even if only a small percentage of the workforce leaves, the impact on a business could be very substantial.
Factors Sometimes Overlooked
Before we point you to our "Annual Cost of Downtime Worksheet," it is important to review a few other factors that will compound your downtime thinking and calculations. Some of these factors are often overlooked in downtime logic.
System reliability embraces the entire interdependent computing enterprise, including all CPUs in all servers and workforce computing stations, operating systems running all platforms, power supplies, DASD (disk drives) on all servers, database-management systems across the enterprise, all application software, network switching and routing devices, and network connections. These components are all subject to some measure of downtime and, because each depends on the others, the effect is multiplicative.
Even if a hardware vendor tells you that a CPU is 99 percent reliable, your system will not be 99 percent reliable. The multiple elements of the system will all contribute very nominal increments of unreliability that collectively, according to statistical probability, may contribute to substantial downtime impact.
For example, a system comprising 10 elements, each with 99 percent reliability, would have an overall reliability factor of 90.44 percent (0.99 raised to the 10th power) and would therefore be expected to be unavailable 9.56 percent of the time. In a 24x7x365 environment, this is almost 838 hours or 35 days of downtime annually!
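The arithmetic in this example can be checked with a short Python sketch. It assumes, as the example does, that every element must be up for the system to be up and that the elements fail independently; the function name and the 8,760-hour year are illustrative choices.

```python
import math


def series_reliability(component_reliabilities):
    """Overall reliability of interdependent components that must all be up.

    Assumes independent failures, so the individual reliabilities multiply.
    """
    return math.prod(component_reliabilities)


HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a 24x7x365 environment

overall = series_reliability([0.99] * 10)  # 10 elements at 99 percent each
downtime_hours = (1 - overall) * HOURS_PER_YEAR

print(f"Overall reliability: {overall:.2%}")
print(f"Expected downtime:   {downtime_hours:.0f} hours "
      f"(about {downtime_hours / 24:.0f} days) per year")
```

Passing a list of differing reliabilities, one per component, shows how quickly a single weak element drags down the overall figure.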
Downtime Relative to Time of Occurrence
Downtime costs vary in accordance with the time of an outage. Increasingly, we are becoming a 24x7 global economy, but not all businesses employ personnel who work during nighttime hours. A system outage at 3 a.m. may have little impact on such an organization. Even round-the-clock businesses, such as most Web-based e-commerce, have highs and lows in activity throughout a 24-hour period. And, of course, downtime occurring on a weekday may have a dramatically different cost impact than downtime occurring on a weekend or holiday.
Concrete History of Planned Downtime
As previously noted, planned downtime constitutes over 80 percent of all downtime, and the very fact that it is planned means the past record of maintenance, operations and periodic events is a reasonable road map to the future in estimating downtime costs. To get a handle on this and calculate the associated costs, we suggest you define the average time of planned-downtime events and their costs.
To do this, for the past 12 to 24 months, audit all standard routine activities - such as database backups and reorganizations, application upgrades and system maintenance - to get a time average per occurrence of activity. Then multiply this time average per occurrence by the number of times the activity is performed per year, adjusting for organizational growth.
Other planned activities - such as operating-system/software/hardware upgrades or disaster recovery testing - may be less routine, but by following the same historical process as above, you can fairly accurately define frequency and duration of the required tasks, establishing averages that play into the downtime cost scheme. Remember to make adjustments that factor in known activities relative to future growth.
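The averaging procedure described above might be sketched as follows. The activity names, durations, frequencies and the growth factor are hypothetical audit data, and treating growth as a simple multiplier on annual hours is an assumption; your own adjustment may need to be more nuanced.

```python
def annual_planned_downtime_hours(observed_durations_hours,
                                  occurrences_per_year,
                                  growth_factor=1.0):
    """Average duration per occurrence x occurrences per year x growth adjustment."""
    if not observed_durations_hours:
        raise ValueError("at least one observed duration is required")
    average_hours = sum(observed_durations_hours) / len(observed_durations_hours)
    return average_hours * occurrences_per_year * growth_factor


# Hypothetical audit of the past 12 to 24 months:
# activity -> (observed durations in hours, occurrences per year)
activities = {
    "database backup": ([2.0, 2.5, 1.5], 52),
    "system maintenance": ([4.0, 6.0], 12),
}
growth_factor = 1.10  # assumed 10 percent adjustment for organizational growth

total_hours = 0.0
for name, (durations, per_year) in activities.items():
    hours = annual_planned_downtime_hours(durations, per_year, growth_factor)
    total_hours += hours
    print(f"{name}: {hours:.1f} hours per year")

print(f"Total planned downtime: {total_hours:.1f} hours per year")
```

Multiplying the resulting hours by an hourly downtime cost, such as one derived from the tangible-cost formulas in part 2 of this series, turns the estimate into a dollar figure.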
Significant Costs in Contractual Obligations
Almost all organizations operate with some form of service level agreement, both internal and external, and will suffer penalties, both direct and inherent, as a result of lost availability or even a reduction in service levels. The direct penalties are typically imposed as a way to ensure that the organization utilizes various forms of downtime protection. The inherent penalties are manifested in damaged reputation, compromised employee morale and the costs to recover. Any analysis of potential downtime financial exposure should not ignore this highly important area.
Annual Cost of Downtime Worksheet
Throughout this three-part series, we have presented a lot of information on the various elements that contribute to a bottom-line figure on the cost of downtime. Now it's time to put it all together. But, remember, isolating an individual figure that represents your annual cost of downtime is not a black-and-white process. The unique intricacies in different types of businesses color this process in many shades of gray.
With all this in mind, we present a valuable tool, an "Annual Cost of Downtime Worksheet," which logically consolidates many of the key tangible and intangible cost components and other information discussed in this article series, delivering a basic model for calculating a cost-of-downtime number. Since most decisions on an availability strategy or solution mix are based on a company-wide annual perspective, our worksheet is designed to establish a baseline annual number against which ROI can be assessed.
The worksheet may be accessed by downloading Business Continuity Solution Series White Paper 102, "Understanding Downtime," at http://www.visionsolutions.com/bcss. See the worksheet at the end of the white paper.
Readers using the worksheet are encouraged to reference applicable sections in the White Paper for clarification, as needed. Results should be considered a beginning step in comprehensive downtime analysis.
After a one-month break, our Managed Availability Memo series will continue in November. Now that we've looked at the fundamentals of managed availability and have learned how to assess downtime risk, it is time to begin examining the various solution options that constitute the technology tool box of managed availability. As the months unfold, we will review tape backup, disk mirroring, real-time data and object replication, vaulting, clustering and more.
DM Review Online readers who wish to study managed availability issues and technology in greater depth may subscribe to Vision Solutions' Business Continuity Solution Series at www.visionsolutions.com/BCSS.