In early August, a lightning strike at a transformer owned by one of Amazon’s power suppliers caused a major failure in Amazon Web Services’ European cloud. The power outage affected, among others, the Elastic Compute Cloud (EC2) and Relational Database Service (RDS) cloud services.
According to the Amazon Web Services status page, lightning struck a transformer belonging to an energy supplier for one of the availability zones in the EU-WEST-1 region in Dublin. An availability zone is a set of hardware that supports cloud services and that functions independently of other zones. According to the site, the strike degraded the cloud services. Because those services are composed of complex software components, Amazon had to assign more hardware to bring them back even after the power had been restored.
This is the type of event that causes some businesses to doubt the benefits of cloud computing and to believe that it is not a safe or reliable method of hosting data. While these doubts are understandable, the fact is there are a variety of techniques a cloud computing company can apply to mitigate disruption caused by weather events and resulting power outages.
One option, depending on a client’s business requirements, is for the cloud provider to distribute its servers over multiple availability zones (AZs). Because each zone functions independently, this redundancy provides excellent protection from localized emergencies such as a lightning strike, power surge or other severe weather-related events.
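To make the idea concrete, here is a minimal Python sketch of spreading servers over several zones so that the loss of one zone never takes out every server. The function names, server names and zone labels are illustrative assumptions, not part of any provider's API.

```python
from collections import defaultdict
from itertools import cycle


def spread_across_zones(servers, zones):
    """Assign servers to zones round-robin so no single zone holds them all."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    placement = defaultdict(list)
    # cycle() repeats the zone list for as long as there are servers to place.
    for server, zone in zip(servers, cycle(zones)):
        placement[zone].append(server)
    return dict(placement)


def survivors(placement, failed_zone):
    """Return the servers that keep running when one zone goes down."""
    return [
        server
        for zone, members in placement.items()
        if zone != failed_zone
        for server in members
    ]


# Hypothetical server names and zone labels, for illustration only.
placement = spread_across_zones(
    ["web-1", "web-2", "web-3", "web-4"],
    ["eu-west-1a", "eu-west-1b", "eu-west-1c"],
)
still_running = survivors(placement, "eu-west-1a")
```

With this placement, a failure of the first zone still leaves two of the four servers running in the other zones. A real deployment would also need load balancing and data replication between zones, which this sketch leaves out.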
For other customers, monitoring and frequent backups can mitigate problems when incidents occur. With careful monitoring, engineers will be informed immediately when servers are unreachable. Then, using snapshots of all servers taken on an hourly basis, it is possible to boot the relevant servers in AZs that are available.
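The recovery step described above can be sketched roughly as follows, assuming monitoring has already reported which zones are still healthy. The `Snapshot` type, the `plan_recovery` function and all sample names and timestamps are hypothetical; an actual restore would go through the provider's own tooling.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Snapshot:
    server: str
    taken_at: datetime


def latest_snapshot(snapshots, server):
    """Return the most recent snapshot for a server, or None if it has none."""
    candidates = [s for s in snapshots if s.server == server]
    return max(candidates, key=lambda s: s.taken_at) if candidates else None


def plan_recovery(server_zones, healthy_zones, snapshots):
    """Map each server in an unhealthy zone to (target zone, snapshot to boot)."""
    if not healthy_zones:
        raise ValueError("no healthy availability zone to recover into")
    affected = sorted(s for s, z in server_zones.items() if z not in healthy_zones)
    targets = sorted(healthy_zones)
    plan = {}
    for i, server in enumerate(affected):
        snap = latest_snapshot(snapshots, server)
        # Servers without any snapshot cannot be restored and are left out.
        if snap is not None:
            plan[server] = (targets[i % len(targets)], snap)
    return plan


# Made-up inventory: zone-a is down, zone-b and zone-c are reachable.
server_zones = {"db-1": "zone-a", "web-1": "zone-a", "web-2": "zone-b"}
snapshots = [
    Snapshot("db-1", datetime(2024, 1, 1, 17, 0)),
    Snapshot("db-1", datetime(2024, 1, 1, 18, 0)),
    Snapshot("web-1", datetime(2024, 1, 1, 18, 0)),
]
plan = plan_recovery(server_zones, {"zone-b", "zone-c"}, snapshots)
```

With hourly snapshots, the data lost for a restored server is bounded by roughly one hour, which is the trade-off this approach accepts in exchange for not running every server in several zones at once.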
Several physical redundancies can also help to mitigate problems. These include redundant power supplies and backup generators that are tested to ensure they will kick in when the power fails, unlike Amazon’s generators in Dublin. Running redundant Internet connections simultaneously provides a backup if one provider fails or is performing poorly. Redundant hardware such as multiple hard drives and other components can be arranged so that, if one fails, another immediately and seamlessly takes its place.
Arjan de Jong is Marketing Manager at Jitscale. He has more than 10 years’ experience in all aspects of marketing for technology companies with a passion for out-of-the-box thinking and leaving the obvious paths. Jitscale provides fully managed, secure, on-demand, global, auto-scaling and virtualized IT infrastructures as a service. Based in the Netherlands, Jitscale is currently expanding into the United States.