Microservices Monitoring and Critical Incident Management: How Dynatrace and VictorOps Work Together

Wolfgang Beer, Technical Product Manager at Dynatrace, co-wrote this article.

Microservices can be game-changing if, as Martin Fowler says and Adam Drake explains, you have rapid provisioning, basic monitoring, and rapid deployment already in place. And when microservices meet containers, they can boost software engineering power to a whole new level. Together, they form architectures that act like living, breathing entities and are much more adaptable than in the past.

But an ensemble of microservices is far more complex to understand, let alone troubleshoot, when it comes to performance. Often hosted in modern cloud platforms such as AWS, Azure, or OpenStack, microservices are dynamically started and scaled depending on actual demands and traffic. As useful as this process is, managing availability, detecting errors, and identifying performance problems become especially demanding for DevOps teams.

These rapidly changing environments and dynamically scaling services mean that the right responders must be notified quickly when things go wrong. We also need to separate the critical, actionable alerts from a firehose of noise.

Fortunately, Dynatrace and VictorOps have a few ideas for how to achieve this goal and give your DevOps teams some relief.

Dynatrace: full-stack monitoring with Artificial Intelligence

First, you need the right notifications. Dynatrace automatically detects changes in your dynamic microservice infrastructure and learns how the entire service environment normally behaves. The system captures each individual transaction, from the user action in your application through to your backend services and databases.

Then Dynatrace puts all that topological and transactional data into context and uses AI algorithms and analytics to detect the root cause of complex incidents. Which components are interrelated? Which measurements fall within the baseline, and which are anomalies that warrant alarms? Without that deep transactional and code-level visibility, it would be impossible for DevOps teams to pinpoint what’s causing errors, slowdowns, or even outages.
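To make the idea of baselines versus anomalies concrete, here is a minimal Python sketch. It is not Dynatrace's algorithm, which the article does not describe. It assumes a simple trailing-window z-score, and the function name, window size, threshold, and sample CPU values are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing baseline.

    Illustrative only: the baseline is the mean of the previous `window`
    samples, and a sample is anomalous when it lies more than `threshold`
    standard deviations away from that mean.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: treat any change as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilization readings (percent), one per minute.
cpu_percent = [41, 43, 40, 42, 44, 41, 43, 97, 42, 40]
print(find_anomalies(cpu_percent))  # prints [7]
```

Only the spike at index 7 is flagged. The readings that follow it are compared against a baseline that now includes the spike, so they pass quietly. That is one reason production systems tend to use more robust baselining than this sketch.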

The screenshot below shows how Dynatrace automatically identifies a CPU spike as the root cause of web application slowdowns. The problem details card also shows the business impact of the detected problem: how many real users were using your web application at the time of the problem, and how many service calls into the backend were affected.

[Screenshot: Dynatrace problem details card]

The attached ‘Visual resolution path’ shows the topological dependencies that were discovered while following the problem impacts.

Even with such in-depth automated analysis of your environment from Dynatrace, it’s still mission-critical to receive problem notifications through a reliable channel such as VictorOps.

Integrating Dynatrace with VictorOps adds more intelligence

Next, it’s time to add intelligent categorization, routing, and remediation instructions to the incoming notifications. Enter VictorOps. Whereas Dynatrace detects problems in real time, VictorOps gives you the tools to create flexible on-call schedules and add intelligence to the incident lifecycle.

By integrating Dynatrace with VictorOps, you can now apply logic to help the right alerts get to the right people. Via the Incident Automation Engine, you can set up VictorOps to do things like:

  • Indicate the level of severity of each incoming notification, so you’re only alerted when something is critically wrong, separating the signal from the noise
  • Route the specific alert to the right responder so the expert closest to the problem can solve it faster
  • Deliver remediation steps alongside alerts, to assist with resolution
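The three steps above can be sketched as a small rule-based triage function in Python. This is an illustration of the general idea only. It does not use the VictorOps API or the Incident Automation Engine's rule syntax, and the `Alert` fields, team names, thresholds, and runbook text are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    service: str
    impact: str            # hypothetical values: "outage", "slowdown", "info"
    affected_users: int
    annotations: dict = field(default_factory=dict)


# Hypothetical routing table and remediation notes.
TEAM_BY_SERVICE = {"checkout": "payments-oncall", "search": "search-oncall"}
RUNBOOK_BY_IMPACT = {
    "outage": "Check recent deployments and roll back if needed.",
    "slowdown": "Inspect CPU and connection-pool saturation on the host.",
}


def triage(alert: Alert) -> Alert:
    """Annotate an alert with severity, routing, and remediation steps."""
    # 1. Severity: page only when something is critically wrong.
    if alert.impact == "outage" or alert.affected_users >= 100:
        severity = "critical"
    elif alert.impact == "slowdown":
        severity = "warning"
    else:
        severity = "info"
    alert.annotations["severity"] = severity
    alert.annotations["page"] = severity == "critical"

    # 2. Routing: send the alert to the team closest to the problem.
    alert.annotations["route_to"] = TEAM_BY_SERVICE.get(
        alert.service, "default-oncall"
    )

    # 3. Remediation: attach runbook text when one exists for this impact.
    runbook = RUNBOOK_BY_IMPACT.get(alert.impact)
    if runbook:
        alert.annotations["runbook"] = runbook
    return alert


result = triage(Alert(service="checkout", impact="slowdown", affected_users=250))
print(result.annotations["severity"], result.annotations["route_to"])
```

In this example a slowdown on the hypothetical `checkout` service affects 250 users, so it is escalated to `critical`, routed to `payments-oncall`, and delivered with the slowdown runbook attached.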

Together, Dynatrace and VictorOps speed time to resolution. The intelligence built into each system alleviates some of the stress, false alarms, and frequent burnout that DevOps and on-call teams experience.

Anonymous Dynatrace customers say this

“We have been using Dynatrace for over 5 years, and find it an indispensable tool during pre-release functional testing, pre-release load testing, and especially post-production troubleshooting of severity one issues. With a breadth of distributed platforms for key application environments, Dynatrace gives us near-real-time (within a matter of seconds) analysis of end-to-end transactions that are spread across multiple servers and multiple layers of the stack…”
(Source: Gartner peer insights)

“Dynatrace has been spectacular to work with. Technology-wise, we use it primarily for root-cause analysis and performance management from an infrastructure perspective, as opposed to APM. But we’re beginning to use it for more comprehensive APM now, and it’s proving very helpful. Relationship-wise, the Dynatrace team is one of the best I’ve worked with in my 20 years in IT. They view their customer relationship as a true partnership.” – IT Architect
(Source: Gartner peer insights)

Bring more intelligence to microservices monitoring

Does this sound good to you? If you’re curious, take Dynatrace for a free 15-day test drive. See VictorOps in action. And if you already use both systems, follow these steps to install the VictorOps/Dynatrace integration. Then please give us feedback on your experience.

The post Microservices Monitoring and Critical Incident Management: How Dynatrace and VictorOps Work Together appeared first on VictorOps.

VictorOps is making on-call suck less with the only collaborative alert management platform on the market.

With easy on-call scheduling management, a real-time incident timeline that gives you contextual relevance around your alerts and powerful reporting features that make post-mortems more effective, VictorOps helps your IT/DevOps team solve problems faster.