OnPage’s 3 Steps to Mastering IT On-Call Scheduling

Almost half of all technology professionals experience on-call as an integral part of their job. Life for an on-call IT engineer often means 2 a.m. wake-up calls for false alarms or for issues the engineer can do little about. These sleep interruptions and tensions inevitably lead to alert fatigue, which is considered the #1 pain point for both traditional IT teams and modern DevOps engineers.

Previous guides have not focused on the issues that most need to be addressed. OnPage is therefore putting forth the following to highlight those issues and to provide solutions that improve life on-call.

The goal of this blog is to:

  • note what has impeded us from reaching effective life on-call
  • provide 3 steps to mastering life on-call
  • highlight what will be achieved with effective life on-call

Issues impeding effective life on call

Email

Email remains the number one channel through which people learn about problems. However, it is the worst way to learn about an issue. Email often gets buried under many other messages, so it gives the recipient no sense of immediacy. Furthermore, there is no easy way to separate the communications about a particular incident within an email channel.

Alert Noise

As more technologies get added to the IT stack, the number of items being monitored is increasing sharply. The resulting flood of alerts is often referred to as ‘alert hell’, and it is only going to grow. In fact, large IT organizations can receive up to 150,000 alerts per day from their monitoring systems. That works out to more than 100 alerts every minute, a volume no team can respond to.

Inefficient Communication

When you cannot reach engineers or colleagues and don’t know who is on-call, your ability to resolve problems drops drastically. Not having the tools to exchange information quickly is also a significant problem. When on-call engineers do have effective communication tools at their fingertips, they are much more productive in managing their on-call shifts and solving problems quickly.

Improving life for IT on-call

More than limiting the number of alerts sent to the on-call team, the goal of on-call is to limit disruption to the end customer. To this end, a pageable alert should fire only when action must be taken. Anything that falls outside that context is a ticket.
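The page-or-ticket rule can be sketched in a few lines. This is only an illustration: the Alert class, its action_required flag, and the route function are hypothetical names, not part of any OnPage product or monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """A monitoring alert (hypothetical structure for illustration)."""
    name: str
    action_required: bool  # assumption: set by whoever defines the alert rule


def route(alert: Alert) -> str:
    """Page only when someone must act now; everything else becomes a ticket."""
    return "page" if alert.action_required else "ticket"
```

In practice the hard part is deciding, per alert rule, whether action is truly required. The routing itself should stay this simple.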

Step 1: Create a fair on-call schedule

Use group schedules to make sure everyone gets a chance at bat. Rotations are key in this regard as they ensure everyone is put on-call at some point during a normal schedule. Moreover, a fair schedule will promote the sense that no one group is being picked on or forced to work more hours than any other.
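As a rough sketch of what a fair rotation means, the following assigns shifts round-robin so that each engineer takes an equal number of turns. The function names, the weekly shift length, and the example engineer names are assumptions made for illustration.

```python
from datetime import date, timedelta


def build_rotation(engineers, start, shifts, shift_days=7):
    """Assign consecutive shifts round-robin across the given engineers.

    Returns a list of (shift_start_date, engineer) tuples.
    """
    schedule = []
    for i in range(shifts):
        shift_start = start + timedelta(days=i * shift_days)
        schedule.append((shift_start, engineers[i % len(engineers)]))
    return schedule


def shift_counts(schedule):
    """Count shifts per engineer so fairness can be checked at a glance."""
    counts = {}
    for _, engineer in schedule:
        counts[engineer] = counts.get(engineer, 0) + 1
    return counts


# Example: six weekly shifts shared by three (made-up) engineers.
rotation = build_rotation(["ana", "ben", "chi"], date(2024, 1, 1), shifts=6)
```

When the number of shifts is a multiple of the team size, every engineer ends up with the same count. Otherwise the counts differ by at most one.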

Step 2: Make sure alerts are persistent

How many times has someone on your team said they didn’t respond to the alert because they didn’t hear it? Most alerting technologies notify engineers via SMS or email and don’t provide persistent alerting if the engineer is temporarily out of range.

Make sure you are using a tool that avoids these problems by creating persistent alerts that will be heard. If the engineer is out of range when the alert fires, the alert should still sound once the engineer comes back into range.
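One way to picture persistent alerting is a loop that keeps re-sending until the engineer acknowledges. The sketch below is not how any particular product works. The send and is_acknowledged callables, the retry interval, and the attempt limit are all assumptions.

```python
import time


def page_until_acknowledged(send, is_acknowledged,
                            interval_seconds=60, max_attempts=10,
                            sleep=time.sleep):
    """Re-send an alert until it is acknowledged or attempts run out.

    send            -- callable that delivers the alert (hypothetical)
    is_acknowledged -- callable returning True once the engineer responds
    sleep           -- injectable so the loop can be tested without waiting

    Returns the number of sends it took, or None if nobody acknowledged.
    """
    for attempt in range(1, max_attempts + 1):
        send()
        if is_acknowledged():
            return attempt
        if attempt < max_attempts:
            sleep(interval_seconds)
    return None  # never acknowledged; the caller decides what happens next
```

A one-shot SMS or email corresponds to max_attempts=1, which is exactly the behavior that lets an alert go unheard.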

Step 3: Messaging for efficient communications

Make sure the on-call communications tools you use enable communication between engineers. That is, make sure they have a tool that supports both alerting and critical communications. Engineers should be able to message fellow engineers as well as groups.

Ideally, your messaging platform will also integrate with widely used industry tools such as Slack. From Slack, for example, engineers could alert colleagues to significant events that need their input.

Conclusion

Life on-call doesn’t need to remind everyone of a Stephen King horror novel. With adequate forethought, life on-call can be manageable, and alert fatigue can decrease.

Want to read 4 more steps to improve on-call scheduling? Download our whitepaper.

The post OnPage’s 3 Steps to Mastering IT On-Call Scheduling appeared first on OnPage.

OnPage is a disruptive technology and application that leverages today's technology and smartphone capabilities for priority mobile messaging. With a top notch history of ensuring uninterrupted communication for businesses and critical response organizations, OnPage is once again poised to pioneer new mobile communications methodology for business and organizational use.
