
How Retailers Can Prevent Downtime with Incident Management

In recent years, it feels as if many major brands have suffered major infrastructure failures on one of the busiest holiday shopping days of the year: Black Friday. That’s because many of them have.

Seeing big names associated with website crashes and business disruptions may be intimidating to many admins. If large international retailers struggle to keep their infrastructure running smoothly on the busiest shopping day of the year — when they know far in advance that they’ll need all hands on deck — how can smaller companies prevent downtime on a normal day?

That’s a daunting question. Fortunately, it doesn’t mean that hope is lost for everyone. By following the right incident management procedures, even small teams can minimize the impact of inevitable disruptions to business operations.

This post explains how to do that, with a focus on the needs of retailers.

Defining Retailer Priorities

To perform effective monitoring and incident management for a retailer, admins first have to understand what a retailer’s top requirements are when it comes to infrastructure availability and uptime.

For most modern retailers, which have both brick-and-mortar and online sales outlets, ensuring the following is essential:

  • Keep customer-facing websites operating. This is difficult because, by definition, customer-facing sites are on the public Internet, where they may be subject to attacks (DDoS attacks, intrusion attempts and so on), not to mention the threat of crashing from simple, non-malicious traffic spikes. These sites are also essential for retailers, because they fuel sales. Customers typically use websites to plan future purchases, whether those purchases are ultimately made online or in-store.
  • Keep backend systems running. Backend servers, which handle tasks like keeping track of inventory and storing transaction histories, are also vital for business operations. While they can generally be secured more easily from attackers than public-facing sites because they can run on private networks, backend systems are more vulnerable in other ways. They are likely to contain highly sensitive information, for example, which makes effective monitoring essential.
  • Ensure uptime of point-of-sale (POS) systems. Brick-and-mortar retailers can’t make sales if their POS terminals crash. Keeping these systems running requires effective management of a complex mix of variables, from local network connectivity to physical security and power supply.
  • Protect IoT assets. As retailers make greater use of the Internet of Things (IoT) to personalize and automate workflows, guaranteeing the uptime and connectivity of the devices and sensors that power retail operations is crucial. In this respect, the move toward highly automated, device-based business operations also raises new challenges for organizations in the context of monitoring.

These are retailers’ primary requirements to ensure completed transactions. Now, let’s discuss how monitoring and incident management can be used to meet key challenges.

Preventing Retailer Downtime

If you want to keep the most important parts of your retail infrastructure running smoothly, you’ll want to adhere to these guidelines:

  • Maximize visibility across the infrastructure. With so many variables in play, retailers tend to have especially complex and diverse IT infrastructure. As noted above, it includes not just public websites, but also backend systems and a variety of special-purpose devices and sensors. To keep track of infrastructure like this, organizations require across-the-board visibility. All monitoring information needs to be centralized into a single location, as that’s the only way to truly make sense of it.
  • Deploy flexible monitoring solutions. A diverse infrastructure also requires diverse monitoring tools. Retailers should make sure that they have monitoring agents installed on all the different parts of their infrastructure, and that the monitoring information they collect is forwarded and normalized within a central management platform.
  • Respond in real time. For retailers, just a few hours (or even minutes) of downtime on a sales site or POS system can have very costly repercussions. In addition to the sales lost as a direct result of the downtime, companies also suffer damage to their reputations, so the effects can last for months. To mitigate these risks, retailers need to be sure that their incident management systems and workflows enable real-time response powered by actionable insights so that service is restored as quickly as possible.
  • Communicate effectively. One of the challenges of incident management in the retail sector is that a company’s infrastructure tends to be very large and very distributed, especially for retailers that have large networks of stores and warehouses. The admins who keep infrastructure running are likely to be distributed, too. Addressing this challenge requires an incident management system that provides seamless communication tools, and takes advantage of collaborative, shared workflows such as ChatOps. This way, a large team of admins spread out over a wide area can communicate effectively when resolving problems.
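
To make the centralization and normalization guidelines above more concrete, here is a minimal Python sketch of how events from different parts of a retail infrastructure might be mapped onto one shared format and then sorted for response. Everything in it is an assumption made for illustration: the source names ("web", "pos", "iot"), the payload fields, the severity scale, and the thresholds are hypothetical and do not reflect any particular monitoring agent or incident management product.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical severity scale shared by every source (higher is more urgent).
SEVERITY = {"info": 0, "warning": 1, "critical": 2}


@dataclass(frozen=True)
class Event:
    source: str      # illustrative source names: "web", "pos", "iot"
    component: str   # which site, terminal, or sensor reported the event
    severity: str    # one of the keys in SEVERITY
    message: str


def normalize(source: str, raw: dict[str, Any]) -> Event:
    """Map a source-specific payload onto the shared Event format.

    The raw field names are invented for this sketch; real monitoring
    agents will emit different payloads.
    """
    if source == "web":
        status = raw["status"]
        severity = "critical" if status >= 500 else "warning" if status >= 400 else "info"
        return Event("web", raw["host"], severity, f"HTTP {status}")
    if source == "pos":
        online = raw["online"]
        severity = "info" if online else "critical"
        return Event("pos", raw["terminal_id"], severity,
                     "terminal online" if online else "terminal offline")
    if source == "iot":
        battery = raw["battery_pct"]
        severity = "warning" if battery < 20 else "info"
        return Event("iot", raw["sensor"], severity, f"battery {battery}%")
    raise ValueError(f"unknown source: {source}")


def triage(events: list[Event]) -> list[Event]:
    """Return events at warning level or above, most severe first."""
    actionable = [e for e in events if SEVERITY[e.severity] >= SEVERITY["warning"]]
    return sorted(actionable, key=lambda e: SEVERITY[e.severity], reverse=True)
```

In a sketch like this, an offline POS terminal and a low-battery sensor end up in the same queue as a failing website, ordered by urgency, so whoever is on call looks at one list rather than three separate tools. A real deployment would replace the hard-coded rules with whatever the chosen monitoring and incident management platform provides.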

It’s safe to say that there will never be a total eradication of the threat of downtime. But modern monitoring and incident management solutions play a key role in helping retailers large and small avoid becoming the next headline about a major service failure.

Download our latest ebook to learn more about incident management for retailers and the impact of downtime on retailers, or contact us today.


The post How Retailers Can Prevent Downtime with Incident Management appeared first on PagerDuty.


More Stories By PagerDuty Blog

PagerDuty’s operations performance platform helps companies increase reliability. By connecting people, systems and data in a single view, PagerDuty delivers visibility and actionable intelligence across global operations for effective incident resolution management. PagerDuty has over 100 platform partners, and is trusted by Fortune 500 companies and startups alike, including Microsoft, National Instruments, Electronic Arts, Adobe, Rackspace, Etsy, Square and GitHub.
