Amazon EC2 Outage Reveals Challenges of Cloud Computing

Amazon's cloud infrastructure, which provides data center and cloud hosting services, suffered a power outage last week, creating problems for its clients that lasted more than 24 hours.

The blackout lasted several hours and affected dozens of notable sites, including Foursquare, Quora, Moby and Reddit. Many large EC2 users ended up losing valuable business data. Chartbeat told its customers that it had lost 11 hours of historical data, which it described as “unrecoverable.”

Amazon said the problems were due to a power failure, but did not provide further details on what caused it.

Amazon told affected customers: “A few days ago we sent you an email letting you know we were working on the recovery of an inconsistent data snapshot of one or more of your Amazon EBS volumes. We’re sorry, but ultimately, our efforts to recover the volume manually were unsuccessful. The hardware failed in such a way that we could not forensically restore the data.”

According to reports, Amazon’s data center in Ashburn, Virginia, lost power for about 30 minutes.

“We can confirm network connectivity issues for some EC2 instances in a single Availability Zone in the US-EAST-1 region,” Amazon reported in its Service Health Dashboard. “Customers may be experiencing impaired read/write access to their EBS (Elastic Block Store) volumes. New instance launches are also delayed. We are applying mitigations to address the connectivity issues … and connectivity is beginning to recover.”

Amazon further added, “We know how important our services are to our customers’ businesses, and we will endeavor to learn from this event and use it to drive improvement in our services.”

It is hard to believe that a cloud service as reliable as EC2 does not maintain a foolproof backup system. Amazon EC2, Rackspace, Google Apps and Microsoft Azure have all had their fair share of outages in the last 18 months, and some of them have been major failures (an Amazon interruption in April 2011 lasted 47 hours for some customers).

A recent report from the International Working Group on Cloud Computing Resiliency (IWGCR) says customers have suffered 568 hours of downtime across 13 well-known cloud services since 2007, resulting in $71.7 million in economic losses.

This puts a big question mark over the reliability of the cloud and challenges the popular perception that the cloud is infallible. A system, however redundant, is not immune to failure, human error, software bugs and the like.

Mitigation Steps

Recoverability of the system becomes an important issue when such incidents occur. Many organizations do not take restoring the system seriously and do not test their recovery procedures before an incident happens. Back up data regularly and store it away from your primary cloud service provider (CSP). For example, you might use an Amazon EC2 instance to back up a Rackspace cloud installation. This mitigates the risk of a single point of failure.
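The backup-and-test advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two local directories stand in for the primary and secondary providers, and the function names back_up and verify_restore are hypothetical, not part of any provider's API. The point is that the backup is checked against a checksum manifest before an incident, not after one.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(primary_dir: Path, secondary_dir: Path) -> dict[str, str]:
    """Copy every file to the secondary location and return a checksum manifest.

    Both directories are stand-ins (an assumption of this sketch) for storage
    held at two different providers.
    """
    manifest = {}
    for source in sorted(primary_dir.rglob("*")):
        if source.is_file():
            relative = source.relative_to(primary_dir)
            target = secondary_dir / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            manifest[relative.as_posix()] = sha256_of(source)
    return manifest


def verify_restore(secondary_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the backed-up files that are missing or do not match the manifest."""
    failures = []
    for relative, expected in manifest.items():
        candidate = secondary_dir / relative
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failures.append(relative)
    return failures
```

Running verify_restore on a schedule, and treating any non-empty result as an alert, is one way to find out that a backup is unusable while the primary copy still exists.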

Create a disaster management plan that also prepares your public relations and customer service staff, establishes quality control processes, and puts a contingency plan in place at the executive level, so that the business is not left scrambling in a panic to protect itself.

In addition, distributing a critical application geographically across multiple data centers can protect it from a network failure confined to a single data center.
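As a rough sketch of what geographic distribution buys you, the Python below picks the first data center whose health check passes. The function names and the URLs in the usage note are hypothetical, and a real deployment would more likely rely on DNS failover or a load balancer than on client-side code like this.

```python
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Report whether a health-check URL answers with HTTP 200."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status == 200


def pick_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first endpoint whose health check passes, or None if all fail."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A probe that raises (timeout, refused connection, DNS failure)
            # is treated the same as an unhealthy data center.
            continue
    return None
```

With a preference-ordered list such as ["https://us-east.example.com/health", "https://us-west.example.com/health"] (placeholder addresses), an outage confined to the first data center causes pick_endpoint to return the second one.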
