
The Future of the NOC

One of the best things about working at PagerDuty is that our customers, our users, our champions, and our buyers are all the same people. With this year’s push into major incident response, we’ve spent a lot of time talking to Network Operations Centers (NOCs) about what the future holds for them.

Every job changes with new technology. Some, like long-distance trucking, will be completely disrupted by self-driving trucks. But after all the discussions we’ve had with the best NOCs around, it looks like their evolution will be significant but manageable.

I’ve always thought of PagerDuty as improving your Mean Time To Promotion. In keeping with that, here are some of the possible futures we see for NOCs.

Site Reliability Engineer

One of the most straightforward paths is towards becoming a Site Reliability Engineer (SRE).

If you want a job doing this, you need all the troubleshooting skills of a systems admin, layered with a deep understanding of monitoring. The goal of an SRE is to detect glitches before they develop into problems that users can notice. And if that doesn’t work, SREs move heaven and earth to get everything back online. You’ll frequently see SRE positions at big cloud or online companies, like Amazon, Google, Heroku, and even Etsy. People get really cranky if they can’t buy things immediately, and SREs are there to make sure they can.

SREs keep the world online (ok, that’s kind of a big ask). As an SRE, you would work with a team to predict needs and build scale in a way that is fluid and invisible from the front end. For a company, Site Reliability Engineering is the art of never letting the user see you sweat. You’re working to make sure there is always enough capacity, enough uptime, enough pipe, and enough monitoring that nothing falls apart invisibly.
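To make "catching it before users notice" a little more concrete, here is a minimal Python sketch. The function name `classify_utilization`, both thresholds, and the sample readings are assumptions made up for illustration; they are not taken from PagerDuty or any particular monitoring tool. The idea is simply to raise a warning well below the level where users would start to feel the problem.

```python
# Hypothetical thresholds, chosen for illustration only.
WARN_AT = 0.70      # start paying attention here, long before users notice
CRITICAL_AT = 0.90  # assumed level at which users begin to feel the problem


def classify_utilization(used: float, capacity: float) -> str:
    """Return 'ok', 'warning', or 'critical' for one resource reading."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    ratio = used / capacity
    if ratio >= CRITICAL_AT:
        return "critical"
    if ratio >= WARN_AT:
        return "warning"
    return "ok"


# Made-up sample readings: resource name -> (used, capacity).
readings = {
    "disk": (720, 1000),
    "connections": (450, 1000),
    "bandwidth": (950, 1000),
}

statuses = {
    name: classify_utilization(used, capacity)
    for name, (used, capacity) in readings.items()
}

for name, status in statuses.items():
    print(f"{name}: {status}")
```

In practice the thresholds would come from your own capacity data rather than fixed constants, but the shape is the same: the warning tier exists so that someone can act while the problem is still invisible from the front end.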

Instead of firefighting, you want to be a building inspector, designing wider hallways, doors that always swing out, and multiple staircases (metaphorically). It may look heroic to jump in with a fire ax and a hose and tear down doors and fight flashovers, but it’s better to never need the heroics if you have smart policies around building materials and building sprinklers.

Ops becomes QA

Historically, quality assurance (QA) at software companies has had an unfair reputation. In fact, there are lots of great companies like Microsoft where there’s a parallel track for Software Development Engineers in Test (SDET). Manual click testing long ago gave way to automated unit tests, which have since grown into automated click and API tests against the staging server.

Operations and QA are the formalizations of, “Eek! Things are broken.” If you have a solid QA team checking things in test before you deploy, there are far fewer surprise outages. If you have an Operations team, they design and build things mindfully, considering risk and performance, rather than simply installing and hoping things work right.

At their core, DevOps and Operations are about getting servers or containers to meet the “three R requirements”:

  • Reliable: stays up or fails over to something else gracefully
  • Replaceable: you can start a new instance of the server with no special steps
  • Routine: server provisioning and decommissioning should be so easy that you can create a web form to do it

To me, that also sounds a lot like QA.

DevOps means if something broke and woke you up, you are empowered to write the test that ensures it never makes it to production again — you’re already the best part of QA.
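As a sketch of what that can look like, imagine an outage traced back to a service config that shipped with an empty failover list. Everything named here is hypothetical: the `validate_config` function, the `failover_hosts` and `timeout_seconds` keys, and the incident itself are invented for illustration. The point is that the exact failure that woke you up becomes a test case that runs before every deploy.

```python
from typing import List


def validate_config(config: dict) -> List[str]:
    """Return a list of problems found in a (hypothetical) service config."""
    problems = []
    if not config.get("failover_hosts"):
        problems.append("failover_hosts must list at least one host")
    if config.get("timeout_seconds", 0) <= 0:
        problems.append("timeout_seconds must be positive")
    return problems


def test_empty_failover_list_is_rejected():
    # The imagined incident: this config shipped and nothing could fail over.
    bad = {"failover_hosts": [], "timeout_seconds": 30}
    assert "failover_hosts must list at least one host" in validate_config(bad)


def test_good_config_passes():
    good = {"failover_hosts": ["standby-1"], "timeout_seconds": 30}
    assert validate_config(good) == []
```

The test functions follow the `test_` naming convention, so a runner such as pytest would pick them up if this were saved in a test file and wired into the deploy pipeline.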

As you get better at preventing downtime or outages and streamlining requests, you can scale volume more easily because you’re not responding to one-off requests. Think about the difference between manually resetting user logins and offering an automated system to do it. You may spend the same amount of time fixing user login problems, but for ten to twenty times as many users.

NOC as entry point to all of tech

One of my favorite NOCs that I’ve visited is at a telecommunications company in Los Angeles. It’s a classical NOC with an unconventional feel. Starting from the massive wall of dashboards, the room is arranged in rows, with each row representing a promotion in their operations org. Promotions come 6 to 12 months apart on average, with clear milestones, and the path can end in the back row (as a de facto SRE) or lead into other parts of the org. With so many companies lamenting how hard it is to find talent these days, I expect this will become more common.

At PagerDuty we treat our support team in much the same way. Employees in our support org have gone on not only to become managers or take more technical roles inside that org, but also to join the engineering, marketing, and sales teams, and I don’t see any sign of that stopping. (Unsurprisingly, this makes it easier for us to hire great people.)

Change isn’t always bad, but it always comes

Predictions are hard, especially about the future, but it’s clear that the future of the NOC will not be humans watching screens waiting to press buttons. For many classes of always-on applications, it will still make sense to keep people ready to jump into action. The question is what to do with the other 99% of their time.

The NOC has undergone quite a bit of change in recent years and will continue to do so. Those that adapt to the changing digital landscape will position themselves for success, and we look forward to navigating that transition with you.

The post The Future of the NOC appeared first on PagerDuty.



PagerDuty’s operations performance platform helps companies increase reliability. By connecting people, systems and data in a single view, PagerDuty delivers visibility and actionable intelligence across global operations for effective incident resolution management. PagerDuty has over 100 platform partners, and is trusted by Fortune 500 companies and startups alike, including Microsoft, National Instruments, Electronic Arts, Adobe, Rackspace, Etsy, Square and Github.
