
Why You Need to Calculate the Cost and Impacts of Your IT Downtime

July 17, 2012 | Article

Editor’s note: The need for IT uptime is moving into more and more business segments, industries, and downstream into smaller businesses. Obviously downtime has an expensive impact on a business; however, Stratus conducted large-scale surveys and found that most businesses don’t adequately calculate the cost and effects of downtime. I talked with Dave Laurello, CEO of Stratus, about this issue and about options for addressing downtime and achieving 100 percent uptime.
SandHill.com: Why don’t companies know or calculate the cost and effects of IT downtime? Do they not have the resources to do this or is it that they don’t know how to determine the cost?
Dave Laurello: People are under a lot of pressure and have a lot of competing priorities. But I continue to be amazed when I meet potential clients and ask them about the cost of their downtime. Most respond, “I don’t know.” They can, however, tell me how much downtime they have on average. They know they were down maybe a couple of hours per month or went down five times in the past year. That’s their benchmark. But if I then ask, “What did that downtime cost you?” that’s a level of detail they’re not aware of.
SandHill.com: Why is it important for businesses to understand the cost of their IT downtime?
Dave Laurello: Because if it goes down, it affects revenue. One of our customers, for example, is a large credit card authorization firm that runs about eight billion transactions per year through our environment. If they’re down for one second, they lose hundreds of thousands of dollars of revenue and it also affects their customer satisfaction. Understanding the cost of their downtime is how they determine the amount of their investments in IT to ensure that they avoid that downtime.
Usually the motivator for deciding whether you’re going to do something about an issue is the cost of the solution. At the end of the day, you have to explain to the CFO or CEO the ROI of what you’re going to do to avoid downtime. You need to be able to have an ROI conversation like: “We had 10 hours of downtime last year. I think there are things we can do to bring that downtime to zero or to 30 minutes, and this is what it’s going to cost us to do that.”
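The ROI conversation described here comes down to simple arithmetic. The sketch below is one way it might be worked through in Python. Only the 10 hours of downtime and the 30-minute target come from the interview; the cost per hour of downtime and the annual cost of the solution are hypothetical figures chosen purely for illustration, and the function names are our own.

```python
def downtime_cost(hours_down: float, cost_per_hour: float) -> float:
    """Estimated cost of downtime: hours down times cost per hour."""
    return hours_down * cost_per_hour


def uptime_roi(current_hours: float, target_hours: float,
               cost_per_hour: float, solution_cost: float) -> float:
    """Simple ROI: (avoided downtime cost - solution cost) / solution cost."""
    avoided = downtime_cost(current_hours - target_hours, cost_per_hour)
    return (avoided - solution_cost) / solution_cost


# From the interview: 10 hours of downtime last year, target of 30 minutes.
CURRENT_HOURS = 10.0
TARGET_HOURS = 0.5

# Hypothetical assumptions, not figures from the interview.
COST_PER_HOUR = 20_000.0    # assumed revenue lost per hour of downtime
SOLUTION_COST = 100_000.0   # assumed annual cost of the uptime solution

avoided_cost = downtime_cost(CURRENT_HOURS - TARGET_HOURS, COST_PER_HOUR)
roi = uptime_roi(CURRENT_HOURS, TARGET_HOURS, COST_PER_HOUR, SOLUTION_COST)

print(f"Avoided downtime cost: ${avoided_cost:,.0f}")
print(f"ROI on the solution: {roi:.0%}")
```

This counts lost revenue only. Reputation and safety impacts, which come up later in the conversation, would have to be estimated separately.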
SandHill.com: Am I correct in assuming that if companies don’t calculate the downtime cost in terms of revenue, they might make a bad decision and choose a solution that might cost less but not deliver as much uptime?
Dave Laurello: Absolutely. And unfortunately it’s not just loss of revenue. When it comes to public safety and 911 calls, for instance, there is more at stake in IT downtime than revenue; lives are at stake.
Another important element of downtime’s impact is the company’s reputation. If your credit card doesn’t work when you go to check out, you’ll use a competitor’s credit card. Because we all have alternative choices in today’s competitive business world, the necessity for IT uptime has now shifted beyond large enterprises to include small and midsize businesses.
One of my favorite stories is when I went to my local dentist. He couldn’t do the work because his computer was down and he couldn’t do the digitized X-rays. So I had to reschedule my appointment. That was a big inconvenience to me.
Another example is when I sent flowers to my mother on Mother’s Day. After I called her and she didn’t thank me for the flowers, I called the florist on Monday after Mother’s Day and asked what happened. He said his server went down and he didn’t have all the information as to where he needed to send all the flowers.
To the degree that IT is a critical part of delivering a company’s capabilities to its end users and customers, if that goes away for a long enough period of time, there will be a reputation impact. And, of course, if a company’s reputation is tarnished with customers, it will affect revenue.
SandHill.com: Are there any industries in particular that are turning to solutions for uptime more than others?
Dave Laurello: At Stratus one of our fastest-growing segments is manufacturing. It’s very important that manufacturing facilities’ lines are up and running 24×7 and that they have access to the information. Earlier I mentioned public safety (911 calls) and financial services (credit cards, ATM points of sale, and exchanges: businesses where seconds of time relate to money lost or gained). Healthcare and electronic medical records (EMRs) are another fast-growing area. Doctors today usually have a PDA for interfacing with patients’ medical records and also ordering prescriptions from that device. It’s no longer an option to go the paper route if the device goes down.
SandHill.com: How much is the cloud having an impact on uptime? Amazon’s downtime is, of course, famous. Aren’t the cloud hosting providers responsible for investing in infrastructure to ensure uptime?
Dave Laurello: I see the cloud as an excellent alternative for non-mission-critical applications. Companies like the flexibility of cloud technology and its compute-on-demand capability. And they like the cost savings they get from the cloud. But when it comes to their most mission-critical applications where their business is at stake, or security in terms of the data that is being run with their mission-critical application, they’re a bit more hesitant to move to the cloud.
Cloud developers are talking about addressing those issues, but many companies see the public cloud as something that is a little too risky for them. They do look at private clouds for some mission-critical applications.
SandHill.com: How does Stratus help companies achieve greater uptime?
Dave Laurello: We play in the private cloud infrastructure space. We provide resilient, fault-tolerant servers and proactive remote monitoring and management of those servers, which allow companies to implement private clouds that run their mission-critical applications.
Columbia Memorial Hospital, for example, implemented a centralized EMR environment supporting over 300 clinicians around the county (or their service area). They use our product and services to keep that private cloud up 24×7 to run their most mission-critical applications.
SandHill.com: How are your product and services different from others in the market?
Dave Laurello: Most other technologies in this space address failure recovery. With our software for proactive monitoring and management of those environments, we can predict when there are potential problems and remediate them before they turn into failure. So we’re really about failure avoidance versus failure recovery. As a result, 100 percent uptime is absolutely possible.
I can show you Stratus customers (large and small) that have been running for years with us and have had absolutely no downtime. None. Zero. I personally think it’s totally unacceptable to have hours or days of IT downtime in the course of doing business.
SandHill.com: Yet, companies experience downtime for hours or days and, in the case of Amazon’s cloud services, for four days. Are companies not aware that they could have 100 percent uptime?
Dave Laurello: There are a lot of analysts and others who write articles saying 100 percent uptime is impossible, and some companies are kind of programmed by what the analysts and press write. As a result, they think it’s acceptable to live with less uptime.
Other companies are willing to take more risks with their IT infrastructure, so they go with alternative technologies such as clusters or standard servers, which cost less. This is perfectly acceptable for applications that aren’t mission critical and don’t demand the highest level of uptime.
SandHill.com: Realistically, what is an acceptable amount of downtime?
Dave Laurello: At Stratus last year we had on average 81 seconds of downtime across our installed base of 8,000 servers. That’s the benchmark that I hold this company to. And I continue to drive the company to bring that down to below 81 seconds. From my perspective, the CIO of a company would use this as the benchmark and say that, on average, the company should be down 81 seconds or less in a year.
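To put the 81-second benchmark in the more familiar terms of availability percentages, the conversion can be sketched as below. This assumes the 81 seconds is an average per system per year and uses a 365-day year; the function names are our own and purely illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a 365-day year


def availability_percent(downtime_seconds: float) -> float:
    """Availability over one year, as a percentage, for a given downtime."""
    return 100.0 * (1.0 - downtime_seconds / SECONDS_PER_YEAR)


def allowed_downtime_seconds(availability_pct: float) -> float:
    """Downtime per year, in seconds, implied by an availability percentage."""
    return (1.0 - availability_pct / 100.0) * SECONDS_PER_YEAR


# 81 seconds per year, the benchmark quoted in the interview.
benchmark = availability_percent(81)
print(f"81 s/year of downtime = {benchmark:.4f}% availability")

# For comparison: "five nines" (99.999%) allows roughly 315 seconds per year.
five_nines = allowed_downtime_seconds(99.999)
print(f"99.999% availability = {five_nines:.1f} s/year of downtime")
```

On these assumptions, 81 seconds a year is a stricter target than the "five nines" figure often quoted in the industry.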
SandHill.com: Of course, there are so-called “acts of God” that result in downtime.
Dave Laurello: Yes, there are things that one can’t predict. But other than that, there are things companies can do that make 100 percent uptime possible. It’s everything from the technology you select for servers to the type of service monitoring and management, managed services, how you do patch management and how you administer the servers.
There definitely are methods and things that can be done to address uptime. But you need to have the right motivation to invest in those solutions. Unfortunately in today’s world, the motivation comes down to dollars-and-cents tradeoffs. That’s why it’s important to understand the cost of downtime.
Dave Laurello is president and CEO of Stratus Technologies (@stratus4uptime). He rejoined Stratus in January 2000, coming from Lucent Technologies, where he held the position of vice president and general manager of the CNS business unit. At Lucent, Dave was responsible for engineering, product and business management and marketing. Prior to this, he was vice president of engineering of the carrier signaling and management business unit at Ascend Communications. From 1995 to 1998, Dave was vice president of hardware engineering and product planning at Stratus.
Kathleen Goolsby is managing editor at SandHill.com.
