Incident Management for Regulated Industries

Being on-call is already a demanding and sometimes very unforgiving responsibility. If you are working in a regulated industry, however, the demands that incident management places on your organization are likely to be even greater and even less forgiving. In this article, we’ll discuss some of the basic principles of software-related incident management in regulated industries.

Incidents, Regulations, and Compliance

First, however, let’s take a quick look at what a software-related incident means in a regulated industry. If you were to ask most people in software development or IT to define “incidents”, they would probably talk about downtime or poor application response time. Many would also mention security: break-ins, data theft, failure to protect sensitive data, and so on.

But in regulated industries, the term “incident” has a scope that goes far beyond downtime and security issues; it can be anything that places the organization, its products, or its services out of compliance with regulations. For a water company, that might be the presence of E. coli bacteria in the water supply. For a bank, it could be the loss of customer financial data. For a hospital, it could be the failure of critical life-support systems. When regulatory compliance is at stake, incidents involving public safety, the loss of crucial data, or the interruption of key services may be at least as important as those involving ordinary downtime.

Compliance — What’s at Stake

One of the most fundamental issues for any organization involved in a regulated industry is the need to stay in compliance with applicable regulations. Depending on the industry and the nature of the incident, being out of compliance can result in:

  • Fines, fees, or other civil or administrative penalties
  • Lawsuits or other legal action by organizations or individuals affected by the incident
  • Suspension or loss of licenses or other certification required to work in the industry
  • Loss of reputation within the industry or in the eyes of the general public
  • In extreme cases, criminal charges, conviction, and jail time for the responsible individuals

In other words, the stakes can be very high; you do not want to be in the position of explaining your incident management procedures to a judge.

Necessary and Best Practices

How do you manage incidents under such strict conditions? The best incident management is prevention: taking care of potential incidents before they become compliance issues. That isn’t always possible under real-world conditions, so it is important to have incident-response plans that meet both legal requirements and practical necessities. To build such plans, take the following factors into account:

  • Regulatory requirements and guidelines. Always follow regulatory agency requirements with regard to incident management, prevention, and response. These vary, depending on the industry and the agency involved, but they will often include a formal incident response plan, an IT incident response team, and formal documentation of incident response procedures and actions. 

Organizations operating under the Health Insurance Portability and Accountability Act (HIPAA) or the Payment Card Industry Data Security Standard (PCI DSS), for example, must have a documented security-response plan and a response team; the Federal Information Security Management Act (FISMA) likewise includes detailed incident management and response guidelines for federal agencies. Find out which agencies and which requirements your organization is subject to, if you do not already know, and make sure that you are in complete compliance with all requirements.

  • Industry guidelines and best practices. These also vary, depending on the industry. An industry-wide professional organization will often be able to provide a set of recommended practices.

If there are no specific guidelines for your industry, the Common Criteria and the accompanying Common Evaluation Methodology (CEM) documents provide a useful framework for understanding general IT security issues.

General Considerations

There are some basic considerations which apply to all regulated industries and all regulatory frameworks:


Identify all sensitive systems (applications, networks, services, etc.) in which a failure or other malfunction could lead directly or indirectly to a compliance problem. A database containing client medical records, for example, or a program that manages the distribution of power for a public utility, is likely to fall under this heading. Your company’s bookkeeping software, as important as it may be, is probably not a sensitive system in this context.


Your first line of incident management defense is to prevent any of the systems which you have identified as sensitive from even approaching a state of failure. This means that your incident response team should be alerted not only for any failure in these systems, but for any condition which has the potential to lead to a failure. For security-sensitive systems, this might be any activity which suggests an attempted break-in, or any degradation in performance of the security software itself. For systems where public safety is at stake, this could include any anomalous behavior in any key metric. Needless to say, prevention includes full backups of data, and where necessary, full backup systems on standby.
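To make the early-warning idea above concrete, here is a minimal Python sketch that alerts on a sensitive system as soon as a metric crosses a warning level, rather than waiting for outright failure. The `MetricRule` class, the function names, and the threshold values are all hypothetical and chosen for illustration only; real thresholds depend on your systems and your regulators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRule:
    """Thresholds for one key metric (hypothetical, illustrative values)."""
    name: str
    warn_at: float  # early-warning level: a failure may be developing
    fail_at: float  # level at which the system is treated as failed


def classify(rule: MetricRule, value: float) -> str:
    """Return 'ok', 'warn', or 'fail' for a reading where higher is worse."""
    if value >= rule.fail_at:
        return "fail"
    if value >= rule.warn_at:
        return "warn"
    return "ok"


def should_alert(rule: MetricRule, value: float, sensitive: bool) -> bool:
    """Sensitive systems alert on warnings too; other systems only on failure."""
    status = classify(rule, value)
    if sensitive:
        return status in ("warn", "fail")
    return status == "fail"


# Assumed example: failed logins per minute on a records database.
failed_logins = MetricRule(name="failed_logins_per_min", warn_at=20, fail_at=100)
print(should_alert(failed_logins, 35, sensitive=True))   # True
print(should_alert(failed_logins, 35, sensitive=False))  # False
```

The same reading of 35 pages the team for a sensitive system but not for an ordinary one, which is the distinction the paragraph above draws.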

Catching problems before they turn into regulatory compliance failures also requires an incident response team completely in sync, armed with full context from all information sources. In these situations, every second counts! For that reason, it’s vital to have responders defined ahead of time, clear escalation policies, and access to metrics from multiple systems pulled together into a unified view of the issue.
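An escalation policy with responders defined ahead of time can be sketched as a simple ordered list of steps. The following Python example is an illustration under assumed names and timings (`EscalationStep`, `POLICY`, and the responder labels are hypothetical); it is not a description of any particular product's policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    responder: str      # who gets paged at this step (hypothetical labels)
    after_minutes: int  # minutes without acknowledgement before this step fires


# Steps are agreed on in advance and listed in the order they fire.
POLICY = [
    EscalationStep("primary-on-call", 0),
    EscalationStep("secondary-on-call", 5),
    EscalationStep("compliance-officer", 15),
]


def responders_to_page(policy: list[EscalationStep],
                       minutes_unacknowledged: int) -> list[str]:
    """Return everyone who should have been paged by now, in escalation order."""
    return [
        step.responder
        for step in policy
        if minutes_unacknowledged >= step.after_minutes
    ]


print(responders_to_page(POLICY, 7))
# ['primary-on-call', 'secondary-on-call']
```

Because the steps are data rather than ad hoc decisions, nobody has to work out whom to call while the clock is running.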


You will, in effect, need to add another level of priority to your existing incident management triage, giving all compliance-related incidents overriding priority. Suppose your bookkeeping and inventory systems both crash completely, and at the same time your medical records database starts to act like it’s just a bit under the weather. If you don’t have enough IT people on hand to attend to everything, your accounting staff and warehouse crew may need to stand around until your emergency response team takes care of the database. And if public safety is involved, your response team may need to be ready to keep crucial systems going in the immediate aftermath of a major disaster.
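The extra priority level described above can be expressed as a sort key that puts compliance-related incidents first and only then considers ordinary severity. This is a minimal Python sketch; the `Incident` fields and the severity scale (1 is most severe) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    system: str
    severity: int             # assumed scale: 1 = most severe
    compliance_related: bool  # True if a sensitive system is affected


def triage(incidents: list[Incident]) -> list[Incident]:
    """Order incidents: compliance-related first, then by severity."""
    # False sorts before True, so negate the flag to put compliance first.
    return sorted(incidents, key=lambda i: (not i.compliance_related, i.severity))


queue = triage([
    Incident("bookkeeping", severity=1, compliance_related=False),
    Incident("inventory", severity=1, compliance_related=False),
    Incident("medical-records-db", severity=3, compliance_related=True),
])
print([i.system for i in queue])
# ['medical-records-db', 'bookkeeping', 'inventory']
```

Even though the database incident has the lowest ordinary severity, it goes to the front of the queue, mirroring the bookkeeping and medical records scenario.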

All of this may sound formidable, and expensive as well. But the cost of a major incident can be much higher, particularly if a regulatory agency or a judge determines that your company has failed to adequately comply with regulations. The bottom line for you and your company is that preventative incident management is by far the best protection you can have.

If you’re looking for a resource to improve your incident response processes and workflows, check out our open-sourced incident response documentation as well as our financial services solutions brief for an example of how PagerDuty helps regulated industries.

The post Incident Management for Regulated Industries appeared first on PagerDuty.

PagerDuty’s operations performance platform helps companies increase reliability. By connecting people, systems and data in a single view, PagerDuty delivers visibility and actionable intelligence across global operations for effective incident resolution management. PagerDuty has over 100 platform partners, and is trusted by Fortune 500 companies and startups alike, including Microsoft, National Instruments, Electronic Arts, Adobe, Rackspace, Etsy, Square and GitHub.