One of the biggest issues for any cloud computing company is ensuring reliability of the service. We had made massive efforts to ensure that our service would always be up, such as configuring our database to run on several different servers so that if one machine failed, the others would still work, yet in late 2005 our site went down. Customers quickly began to grumble that the service was unreliable. Making matters worse, a competitor signed up for a free trial as a way to ascertain when our service was down, and reported any problems to the press. Literally within minutes, journalists would call seeking comment. Before long, salesforce.com’s reliability issues were widely publicized, and we were in serious trouble.
During the period we struggled with outages, we actually had an uptime rate of 99 percent, and our service was much better and much more reliable than software, but any disruption was understandably maddening for customers. We lost their faith.
Salesforce.com entered an incredibly challenging time. There were fundamental issues with our technology model, and it was unclear whether or not we’d be able to go forward on our current code base. We questioned if the technology would be able to scale as much as it needed to, and whether or not we would be able to continue to deliver the same level of innovation.
Parker and the team of engineers tirelessly scrambled to fix the problem by working with our vendors at Oracle, Sun, and Veritas and rebuilding the software and executing myriad stability projects. We dedicated all our technology resources to solving this issue. All development on new features temporarily stopped. As the engineers worked around the clock to find a solution, the rest of us were unsure of how to respond to the escalating criticism. No one knew what to say to customers or to the press, but we believed a strategy of minimization and containment would serve us best.
At the time, I felt that our public response was not our primary concern. I thought we needed to focus on improving the technology and to remain as low profile as possible until the problem was resolved. Once everything was fixed, I thought, we could respond with a proper explanation and share good news. We stopped taking calls, and we stopped returning calls. This seemed like the safe response, but it was unlike the way salesforce.com usually operated, which made us feel uncomfortable. “This is not who we are; we are usually the ones calling people up,” Bruce Francis, the vice president of corporate strategy, said to me one day during the height of the crisis. “Hiding doesn’t feel right.”
I had to admit that part of me felt that if we didn’t confirm the problems, they didn’t exist. I had mistakenly assumed that reporters wouldn’t write about the issues if they didn’t have our comment. That was an antiquated assumption, however. Blogging was just taking off, and bloggers don’t adhere to the traditional rules that magazine or newspaper reporters follow, such as holding a story until they get confirmation or comment. After the blogs covered it, the established media picked it up. We realized that silence had been a terrible strategy. And it wasn’t just the decision not to talk that had been an egregious error; it was that we had not talked immediately. The problem was exacerbated by the very nature of SaaS: because we hosted everything, people couldn’t call their own data centers and learn what was happening. Customers were annoyed.
As the crisis built, we gathered our top two hundred fifty managers at an off-site meeting. The reliability of our service was, of course, the most pressing topic. Right then we had our worst outage to date. The system went down, and restarting the huge databases took ninety minutes, an eternity for customers dependent on the service. Customers and the press were clamoring for answers, and it was difficult for them to reach anyone because all our managers were at the off-site meeting.
We had to find a way to communicate quickly and candidly—even if going public with our problems felt like a defeat at the moment. Parker and Bruce urged me to post our internal monitoring system, which we used to track our status (everything running perfectly appeared in green, performance issues were tagged in yellow, and service disruptions were marked in red). It was a bold move and a big leap of faith. We would be allowing the public—and the competition—to see exactly how our system was functioning every day. It meant that we would be sharing embarrassing details every time the system slowed or stopped working. Why would any company make itself vulnerable in that way?
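The green, yellow, and red scheme described above can be pictured with a minimal sketch. This is only an illustration, not the actual salesforce.com monitoring system: the names `Status` and `classify` and the one-second slowness threshold are hypothetical, chosen purely to show how a single health check might map to a color.

```python
from enum import Enum


class Status(Enum):
    GREEN = "green"    # everything running normally
    YELLOW = "yellow"  # performance issue
    RED = "red"        # service disruption


def classify(is_up: bool, response_ms: float,
             slow_threshold_ms: float = 1000.0) -> Status:
    """Map one health check to a status color.

    The 1000 ms threshold is an assumed value for illustration only.
    """
    if not is_up:
        return Status.RED
    if response_ms > slow_threshold_ms:
        return Status.YELLOW
    return Status.GREEN


if __name__ == "__main__":
    # Three hypothetical checks: healthy, slow, and down.
    for is_up, ms in [(True, 250.0), (True, 1800.0), (False, 0.0)]:
        print(classify(is_up, ms).value)
```

Publishing a status like this means every yellow or red reading is visible to everyone the moment it happens, which is exactly the exposure that made the decision feel risky.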
At first I was hesitant. It made sense that customers would see what was going on in real time, but I didn’t think that our reliability information should be available to everybody. I worried that journalists and our competitors would use this information against us. Ultimately, however, I let go of my fear and realized that complete transparency was what we needed if we were to restore trust in our company. It would also encourage good behavior from the organization because it added a new level of accountability and responsibility. In the middle of the disaster, we opened up our internal system for everyone to see. I called it the trust site.
The site, located at trust.salesforce.com, offers real-time information on system performance with up-to-the-minute information on planned maintenance, historical information on transaction volume and speed, reports on current and recent phishing and malware attempts, and information on new security technologies and the best security practices. Instead of hiding behind our problems, we started educating customers, prospects, and journalists about where they could find the information they needed. It was liberating not to have to act defensively.
The effort was an instant hit with reporters as they could immediately see for themselves what was happening. We further benefited because it took the “gotcha” weapon away from competitors. Best of all, the trust site gave us an opportunity to talk about something positive — transparency.
There is no question: we would not be around today if we were not always bettering the technology and improving its speed and reliability. (Our service ran at 99.99 percent uptime in the first quarter of 2009, runs more than 200 million transactions a day, and has subsecond response time; and we are constantly making advances to deliver it even faster.) At the same time, I don’t think we would be thriving today had we not shifted to embrace more transparency. The difficult decision to launch the trust site (to “open the kimono,” as Bruce Francis called it) differentiated us. Transparency and trust became a strong part of our branding and identity.
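To put the uptime figures in perspective, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that each percentage were sustained over a full 365-day year (the 99 percent figure described only the outage period, and the 99.99 percent figure a single quarter), and that the 200 million daily transactions were spread evenly across the day. The function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24        # 8,760 hours, ignoring leap years
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours of downtime in a year at a given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


def average_per_second(transactions_per_day: float) -> float:
    """Average transactions per second, assuming an even daily spread."""
    return transactions_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for pct in (99.0, 99.99):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime: {hours:.2f} hours of downtime per year")
    print(f"{average_per_second(200_000_000):.0f} transactions per second on average")
```

Under those assumptions, 99 percent uptime works out to roughly 87.6 hours of downtime a year, while 99.99 percent is about 53 minutes, and 200 million transactions a day averages out to roughly 2,300 per second.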
Reliability is a tech problem, but the way you solve it is not with technology alone; it’s with communication.
Once again, we did not invent this solution; we got our inspiration from the consumer world: eBay pioneered this idea with its pages that inform users of outages, glitches, and maintenance upgrades. This had not been done in the corporate world before we did it, though it has since been validated as a best practice for companies of all stripes. When BlackBerry struggled with service outages, many articles pointed to the salesforce.com trust site as a way to deal successfully with these issues. Out of a crisis that had threatened to damage our reputation, we had created powerful differentiation.
Now, we talk about transparency in every pitch we make to the press and to prospective customers. It’s a cornerstone of our messaging.
Today, if our servers are down — even for 20 minutes — we call our top customers. I call many of our customers personally to apologize and share what is happening. Often they are completely surprised to hear from me. One CIO at a very large company told me that he couldn’t believe I was taking the time to call him. He revealed that he had had data centers down for two days, so an hour of downtime didn’t represent a massive problem to him. We found, however, that open communication, in tandem with quickly fixing the problem, is the only way to build and retain trust.
Marc Benioff is chairman and CEO of Salesforce.com. The above excerpt is reprinted by permission of the publisher, John Wiley & Sons, Inc., from “Behind the Cloud: The Untold Story of How Salesforce.com Went from an Idea to a Billion-Dollar Company—and Revolutionized an Industry,” by Marc Benioff. Copyright (c) 2009 by John Wiley & Sons, Inc. All rights reserved.