Downtime Report: Top Ten Outages in 2013

2013 has seen some massive outages, and given our heavy reliance on technology today, there is more at stake than ever before. Outages affect not only internal users but also a company’s customers and partners, and they impact revenue, credibility, trust, reputation and productivity.

With all of this in mind, we at Neverfail wanted to put together our own list of the year’s top outages – placing them on a scale based on the overall impact of their downtime. The criteria we used to assess that impact span multiple factors:

Expansive Reach

Businesses increasingly depend on the cloud for applications and access to data, so there’s more at stake today than before. In today’s interconnected world, an outage can have a ripple effect across a company’s user base, the country and even the globe.

Damaged Reputation

No one is perfect. But as customers become increasingly dependent upon the cloud for applications and access to their data, perfection is exactly what those customers demand. Big outages draw a lot of media attention and can quickly put a company under attack. On top of that, user forums and social media platforms like Twitter have become the default soapbox for irritated customers to air their complaints. Companies that rely on other cloud platforms to provide their own products and services may also see their reputation suffer if the cloud provider has an outage.

Lost Revenue

It is nearly impossible to determine the exact cost of downtime, since so much depends on the organization, the industry, the number of people impacted, etc. For example, a Standish Study estimated that credit card applications lose around $2.6 million for every hour of downtime, whereas this year’s 49-minute Amazon.com outage reportedly cost the online retailer nearly $5 million in deferred revenue.

It also appears that the cost of downtime is increasing. According to Gartner, in 2005 organizations lost $42,000 for every hour of downtime. In 2011, it was estimated that IT downtime costs $26.5 billion in lost revenue each year, and another study suggested that the average cost of data center downtime across industries is approximately $5,600 per minute.

By that last estimate of $5,600 per minute, the combined downtime of the top ten outages equals a whopping $31,214,400 in lost revenue – and that only accounts for the providers themselves, not their end customers. Ouch!
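As a rough check on that figure, the sketch below adds up the nominal durations from the top-ten list in this post and multiplies the total by the $5,600-per-minute average. It assumes each duration is taken at face value (for example, "over 20 hours" counts as 20 hours and "less than 5 minutes" as 5 minutes). The names `OUTAGE_MINUTES`, `COST_PER_MINUTE_USD` and `estimated_lost_revenue` are made up for this illustration.

```python
# Nominal durations (in minutes) taken from the top-ten list in this post.
# Qualifiers such as "over" or "nearly" are rounded to the stated figure,
# which is an assumption made for this illustration.
OUTAGE_MINUTES = {
    "Windows Azure": 20 * 60,
    "Google": 5,
    "Amazon Web Services": 3 * 60,
    "NASDAQ": 3 * 60,
    "OTC Markets Group": 5 * 60,
    "HealthCare.gov": 16 * 60,
    "Amazon.com": 49,
    "Hotmail and Outlook.com": 16 * 60,
    "Google Drive": 17 * 60,
    "Gmail": 12 * 60,
}

# Average cost of data center downtime cited in the post (USD per minute).
COST_PER_MINUTE_USD = 5_600


def estimated_lost_revenue(minutes_by_outage, cost_per_minute):
    """Return (total minutes, total cost) for a set of outages."""
    total_minutes = sum(minutes_by_outage.values())
    return total_minutes, total_minutes * cost_per_minute


if __name__ == "__main__":
    total_minutes, total_cost = estimated_lost_revenue(
        OUTAGE_MINUTES, COST_PER_MINUTE_USD
    )
    # Prints: 5,574 minutes of downtime -> $31,214,400
    print(f"{total_minutes:,} minutes of downtime -> ${total_cost:,}")
```

Under those assumptions the ten outages total 5,574 minutes, which reproduces the $31,214,400 figure. Because a flat per-minute average is applied to every provider, the result is better read as an order-of-magnitude illustration than as a measured loss.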

We summarized the top 10 outages in an infographic and have also listed them below. We actually reviewed over 30 major outages; we’ll publish that list and some additional analysis next month.

Our index analysis is based on publicly available information and subsequent analysis of each failure.

1. Microsoft’s Windows Azure

  • Date: October 30, 2013
  • Duration: Over 20 hours
  • Failure: A sub-component of the system failed worldwide
  • Impact: Every single Azure region was affected (including West US, West Europe, Southeast Asia, South Central US, North Europe, North Central US, East Asia, and East US)

2. Google

  • Date: August 16, 2013
  • Duration: less than 5 minutes
  • Failure: All of its services went down
  • Fallout: The volume of global Internet traffic plunged by about 40%

3. Amazon Web Services

  • Date: September 13, 2013
  • Duration: Under 3 hours
  • Failure: Connectivity issues affected a single availability zone, disrupting a notable portion of Internet activity.
  • Reminder: If you rely heavily on the cloud for your infrastructure, have a failover plan.

4. NASDAQ

  • Date: August 22, 2013
  • Duration: 3 hours
  • Failure: A software bug, followed by inadequate built-in redundancy capabilities, triggered a massive trading halt in the U.S.
  • Impact: With all the exchanges dependent on one another, the impact of this outage rippled across the globe

5. OTC Markets Group Inc.

  • Date: November 7, 2013
  • Duration: over 5 hours
  • Failure: A network failure led to a “lack of current quotation information,” prompting a complete shutdown in trading of over-the-counter stocks in the U.S.
  • Impact: The shutdown happened during one of the biggest trading sessions this year, as Twitter Inc.’s shares debuted. While the disruption only paused less significant equities such as Fannie Mae and Freddie Mac, it tested investors’ nerves following a series of technical mishaps since August and exacerbated concerns about problems in the electronic infrastructure underpinning U.S. exchanges.

6. HealthCare.gov

  • Date: October 27-28, 2013
  • Duration: 16+ hours
  • Failure: A service outage at a Verizon Terremark data center caused downtime for HealthCare.gov, the online insurance marketplace created by the Affordable Care Act.
  • Impact: With all of America watching the progress of the trouble-plagued marketplace, a data center outage only added fuel to the fire and perhaps made the public question where to point the finger of blame.

7. Amazon.com

  • Date: January 31, 2013
  • Duration: 49 minutes
  • Failure: Internal issues caused the Amazon.com home page to go down, displaying an error message.
  • Impact: The outage demonstrated the extremely high value of uptime to services such as Amazon. Analysts calculated that one hour of interrupted service may have translated to $5 million in lost revenue.

8. Microsoft – Hotmail and Outlook.com

  • Date: March 13, 2013
  • Duration: nearly 16 hours
  • Failure: A firmware update caused the company’s servers to overheat; Hotmail and Outlook.com both suffered a loss of service.
  • Impact: Microsoft admitted that it required some human intervention to bring the services back online, thus delaying the restoration attempt further. Microsoft’s online service reputation took a big hit.

9. Google Drive

  • Date: March 18-20, 2013
  • Duration: 17 hours total
  • Failure: A glitch in the company’s network control software caused latency and recovery problems. Users faced slow load times or full-on timeouts while trying to access their Drive documents and files.
  • Impact: As much as one-third of the customer base was impacted, leading to a virtual hue-and-cry across the Internet.

10. Google’s Gmail

  • Date: September 23, 2013
  • Duration: 12 hours
  • Failure: Prolonged slow download times were triggered by a dual network failure.
  • Impact: The outage affected 29% of users. For 1.5% of Gmail messages, the delay in downloading large attachments was up to two hours. While its impact may not have been catastrophic, the outage at Gmail is a potential cause for concern, especially as businesses are turning to Google and other providers to run cloud-based email and SaaS.

While our index only covers outages through the end of November, the very recent Yahoo Mail outage deserves an honorable mention:

Yahoo Mail

  • Date: December 9-13, 2013
  • Duration: almost 4 days
  • Failure: A specific hardware problem in one of the company’s storage systems caused the prolonged partial email outage for users.
  • Impact: The multiday email outage impacted countless individuals and the many small businesses that rely on the service. Not only did the outage cast a dark shadow over the once-mighty Internet player, but the company was also heavily criticized for the way it handled damage control, particularly its failure to inform users about the problems.

 

 


More Stories By Josh Mazgelis

Josh Mazgelis is senior product marketing manager at Neverfail. He has been working in the storage and disaster recovery industries for close to two decades and brings a wide array of knowledge and insight to any technology conversation.

Prior to joining Neverfail, Josh worked as a product manager and senior support engineer at Computer Associates. Before working at CA, he was a senior systems engineer at technology companies such as XOsoft, Netflix, and Quantum Corporation. Josh graduated from Plymouth State University with a bachelor’s degree in applied computer science and enjoys working with virtualization and disaster recovery.
