April 25, 2011 – Amazon.com Inc. database and cloud services have been restored, and the company is contacting a few remaining customers and outlining a postmortem on the service disruptions that started last week.

In the most recent system status update Sunday night, Amazon noted near-full service recovery for customers of its Elastic Compute Cloud (EC2), an integral part of the company’s Amazon Web Services cloud platform, and its Relational Database Service (RDS).

“The vast majority of affected volumes have now been recovered. We're in the process of contacting a limited number of customers who have EBS volumes that have not yet recovered and will continue to work hard on restoring these remaining volumes,” Amazon posted in separate messages for EC2 and RDS on its service health dashboard.

Both services are located at a Northern Virginia data center. The 26 other Web services located around the U.S. operated normally during this time. Aside from the updates on its dashboard, Amazon has not made any public announcements regarding the outages and did not respond to a request for comment.

Errors and latency started showing up Thursday, interrupting operations for highly trafficked websites such as HootSuite, Foursquare, Quora and Reddit. EC2 and RDS were functional as of Friday, though connection problems persisted through the weekend, according to dashboard reports.

Gartner Web and cloud computing research director Eric Knipp calls the service failure painful, but also a force that should push cloud providers toward greater transparency. Knipp says companies should move their cloud strategy away from a one-size-fits-all approach to avoid deeper impacts from outages like the one experienced at Amazon.

“Ultimately, companies must decide: do I want to architect for resiliency, do I believe I can delegate that resiliency to a provider, or do I just want to take my chances? There is a place for each attitude, depending on project complexity and criticality,” Knipp says.

Dick Csaplar, Aberdeen senior research analyst in the virtualization and storage practice, says that no one currently has a complete handle on the size of the disruptions, and that the attention this downtime has drawn is due to Amazon’s status as the “poster child for successful cloud deployment.” Because most companies are not using the Amazon cloud as their primary storage or database source, the impact will largely be felt in the form of delayed data backups and inconveniences with infrastructure as a service, Csaplar says. He hasn’t heard of any hesitation on cloud deployment from enterprises due to the Amazon disruption.

“We all have to do our homework, we all have to back up our data, but I still think that [Amazon] offers a much higher level of uptime than what most companies are able to generate for themselves,” Csaplar says.
