
5 Things I Didn’t Expect to See While Shadowing On-Call

Shadowing on-call engineers feels like watching open-heart surgery over a doctor’s shoulder. Precision and speed are paramount, and mistakes can lead to severe consequences. As part of my “Welcome-to-PagerDuty” experience, I was added to a PagerDuty engineering team’s on-call rotation as a shadow observer for a week.

During my time on-call, I was on the same escalation policies and received the same notifications as everyone else on the rotation. I woke up whenever the on-call engineers were woken by a notification, and I lurked in the operations Slack channel as they worked through urgent, time-sensitive situations. I expected to see stress bring out the worst in people. To my surprise, I was completely wrong.

Phones buzzed to interrupt sleep, meals, thoughts, and so much more. Chat channels filled with rapid-fire messages, and resolution steps unfolded at all hours of the night. Yet despite the responsibilities and stress of being on-call, I saw kindness, support, and humor.


[Image: xkcd comic]

Source: https://xkcd.com/1349/


(1) One simple sentence makes it easier to get through hours of stress.

In the late afternoon of my first day shadowing, I got my first notification. I hopped on the Slack channel and saw that the investigation had already homed in on an issue from a recent deploy. The team immediately engaged, methodically pausing and provisioning new machines. In the flurry of triaging and diagnosing, I realized that the on-call responder was as new to this situation as I was; we had both started less than a month earlier. He was fixing a problem that he had never seen before.

In the early evening, things were getting back on track. It had been a long day and people were on their way home. Still, a veteran engineer (P) checked in and cheered on his teammate (K):

P [6:14 PM] Is everything done???

K [6:15 PM] Not yet, just in the middle of restarting the hosts. Everything is going smoothly so far though

P [6:18 PM] Way to go K! This sh*t used to take us half a day!

This type of peer encouragement creates confidence for people new to the project, new to the company, or new to the profession. Encouragement from senior engineers helps less experienced engineers feel brave enough to ask more questions. Asking questions is how new people learn and ramp up to become experts. It starts with empowering people: one “way to go” at a time.

(2) Humility and (3) Gratitude

The newbie on-call engineer in my anecdote immediately responded to the encouragement. He acknowledged and appreciated the senior engineers who had helped him. In three quick sentences, he demonstrated humility and gratitude.

K [6:21 PM] No way. I would’ve sank if it weren’t for you and M. So much to learn!

I like that K called out another teammate who had helped him. DevOps engineers deeply value their teammates. They are more than just colleagues: they rely on each other to make sure that the software they’re writing today won’t create a hellish on-call rotation tomorrow. No one person can possibly know everything all the time; they have to constantly learn from each other and lean on each other for help.

In the example above, observe the fine line between humility and self-deprecation. The engineer mentions that he has ‘so much to learn’. This attitude is different from ‘I’ll never do this myself’ or ‘I’ll never understand all of this stuff.’ Being humble and grateful builds trust between teammates. Self-deprecation can sow uncertainty and uneasiness.


(4) Levity and (5) So many gifs

[Image: xkcd comic]

Source: https://xkcd.com/512/


Being on-call can be repetitive and (blissfully) boring. On-call predictability reduces chaos and stress, but there will always be some inherent anxiety associated with being on-call. Luckily, the Internet has given on-call teams (and the rest of us) gifs, which defuse stressful on-call shifts with delightful bursts of humor on loop.

PagerDuty has a strong culture of humor and a particular passion for custom gifs. I loved observing this team, distributed across three separate time zones and two different countries, have fun and be goofy with each other. I noticed that people who weren’t currently on-call were consistently present and upbeat.

[Image: xkcd comic]

Source: https://xkcd.com/1301/


So, how can you make your team’s on-call experience more positive?

  • Be present for your team, even if you’re not currently on-call
  • Cheer people on, especially people who are new to your team
  • Acknowledge when something is challenging, risky, or time-consuming
  • Be specific with your thanks — acknowledge the people who directly impacted you. It will help the team as a whole understand who has which strengths in future situations.
  • Be willing to get silly. Everyone loves to laugh.

I saw plenty of problem-solving, some heartache and frustration, and even parts of people’s weekends and evenings absorbed in being on-call. In the midst of it, small gestures from teammates remind you that you’re not alone.

When it feels like the sky is falling, it helps when people are excellent to each other.

For more on how to engender empathy in DevOps, check out our post on #HugOps.

The post 5 Things I Didn’t Expect to See While Shadowing On-Call appeared first on PagerDuty.


