The Diverse Alerting Needs for Application Performance Monitoring

In today’s digital economy, most business services rely on IT applications. The increasing dependency on applications has resulted in the growing adoption of application performance monitoring (APM) solutions. The goals of an APM solution are:

  • To ensure high application uptime, service reliability and a great end-user experience
  • To proactively diagnose performance problems so that the responsible stakeholder (application owner, IT Ops, DevOps, developer, etc.) can fix them before users notice them

Modern APM solutions must not only provide deep monitoring functionality, but also actionable intelligence that simplifies an administrator’s job of finding and fixing application problems. Alerts on performance deviations, errors, warnings, bottlenecks and the like are baseline requirements for an APM tool, but enterprise IT teams now also expect context-aware alerting for fast, smart resolution of problems.

  • Business-aware alerting: As IT infrastructures have evolved to be multi-tiered with inter-dependencies between tiers, monitoring in silos is no longer sufficient. Service owners need to know when a service is impacted, so an APM tool should embed the intelligence to discover application topologies, and these topologies should, in turn, provide an admin with business-aware, service-level alerts. To support this, when any application component failure is detected, the states of all services that depend on the affected component should reflect the performance problem. This way, when users contact the helpdesk to report problems with a service, the helpdesk staff can quickly determine whether or not the complaint relates to a known problem with the service.
  • Root cause alerting: Determining the root cause of an application slowdown is one of the most difficult tasks for IT operations teams. Here again, application inter-dependencies and infrastructure inter-dependencies make root cause alerting very difficult: a problem in one tier can ripple out and affect several others. For root cause alerting, APM tools must consider inter-application and application-to-infrastructure dependencies. For example, a web application may be slow because of slow query processing in the backend database. In turn, the database server may be running on a storage device where one of the RAID array disks has failed, limiting the throughput the device can support and causing the application’s database queries to take extra time. In this scenario, an APM tool should highlight the root cause (i.e., the storage device issue) and indicate all of the effects (i.e., database server slowness and application slowness). Accurate root cause alerting results in improved user satisfaction and higher service uptime. It also enables IT operations staff to spend less time fire-fighting problems and enhances operations productivity. A minimal sketch of this dependency-based reasoning follows the Root Cause Alert figure below.

Root Cause Alert
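
To make this dependency-based reasoning concrete, here is a minimal Python sketch of how alert states could be propagated across a service topology and how a likely root cause could be separated from its effects. The topology, component names and alert set are illustrative assumptions, not the data model of any particular APM product.

    # Hypothetical topology: each component maps to the components it depends on.
    DEPENDS_ON = {
        "web_app": ["db_server"],
        "db_server": ["storage_array"],
        "storage_array": [],
    }

    # Components currently raising alerts (slow pages, slow queries, failed disk).
    raised_alerts = {"web_app", "db_server", "storage_array"}

    def affected_services(topology, alerts):
        """A service is impacted if it, or anything it (transitively) depends on, has an alert."""
        def impacted(node, seen=frozenset()):
            if node in alerts:
                return True
            return any(impacted(dep, seen | {node})
                       for dep in topology.get(node, []) if dep not in seen)
        return {node for node in topology if impacted(node)}

    def root_causes(topology, alerts):
        """Alerting components whose own dependencies are healthy are the likely root causes."""
        return {node for node in alerts
                if not any(dep in alerts for dep in topology.get(node, []))}

    print(affected_services(DEPENDS_ON, raised_alerts))  # all three components are impacted
    print(root_causes(DEPENDS_ON, raised_alerts))        # only the storage array is a root cause

In this toy example, the storage array is identified as the root cause because it is the only alerting component whose own dependencies are healthy, while the database server and web application surface as impacted services.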

  • Aggregated alerting on farm-wide metrics: Large infrastructures have many servers in a farm/cluster. An administrator may only need to be alerted when, for example, four out of six web servers are facing connection spikes. This provides the actionable intelligence needed to determine when additional servers should be added to support the growing connection load. More complex conditions across multiple servers should also be supported. For example, an administrator may want to be alerted when 25% of servers are reporting CPU utilization above 80%. Such farm-wide alerts help administrators understand the health and capacity requirements of the entire farm (rather than just individual servers/nodes). A sketch of this kind of farm-wide evaluation appears after this list.
  • Composite alerting: Management-level reports must present simplified views of performance, instead of only detailed metrics. For example, consider a CIO who wants to know whether the user experience of a core virtual desktop service is good or not. Many factors affect the user experience, including a user’s logon time, application launch time, screen refresh latency, bandwidth availability, etc. While an IT operations person is interested in the details, the CIO is not; the CIO is only looking for the overall user experience. APM tools must offer composite alerting functionality to simplify executive-level reporting. A composite alert is the collective representation of the state of multiple metrics. By assigning weights to different metrics (e.g., logons happen less frequently and so may carry a lower weight than screen refresh latency) and using a weighted average method, a composite rating is obtained: a simplified percentage value indicating user experience. Examples of composite alerts include user experience, Apdex score, stress level for servers, etc. A sketch of the weighted-average calculation follows the Composite Alert figure below.
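
As referenced in the aggregated alerting item above, here is a minimal sketch of how “N of M servers” and percentage-based farm-wide conditions could be evaluated over per-server metrics. The server names, metric values and thresholds are hypothetical and only illustrate the aggregation logic.

    # Hypothetical per-server metrics collected by the monitor.
    connections = {"web1": 950, "web2": 990, "web3": 430,
                   "web4": 970, "web5": 880, "web6": 310}
    cpu_percent = {"app1": 92, "app2": 45, "app3": 88, "app4": 30,
                   "app5": 85, "app6": 20, "app7": 65, "app8": 40}

    def count_alert(metrics, threshold, min_servers):
        # Fire when at least `min_servers` servers exceed `threshold` (e.g. 4 of 6 web servers).
        breaching = [s for s, v in metrics.items() if v > threshold]
        return len(breaching) >= min_servers, breaching

    def percent_alert(metrics, threshold, min_fraction):
        # Fire when at least `min_fraction` of the farm exceeds `threshold` (e.g. 25% over 80% CPU).
        breaching = [s for s, v in metrics.items() if v > threshold]
        return len(breaching) / len(metrics) >= min_fraction, breaching

    print(count_alert(connections, threshold=850, min_servers=4))       # fires: 4 of 6 web servers spiking
    print(percent_alert(cpu_percent, threshold=80, min_fraction=0.25))  # fires: 3 of 8 servers (37.5%) over 80%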

Composite Alert
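
Following the weighted-average method described above, here is a small sketch of how per-metric ratings could be rolled up into a single composite user-experience percentage. The metric names, ratings and weights are illustrative assumptions.

    # Each metric is first rated 0-100 (100 = best) against its own thresholds or baselines.
    ratings = {"logon_time": 70, "app_launch_time": 85,
               "screen_refresh_latency": 60, "bandwidth_availability": 95}

    # Less frequent activities (e.g. logons) carry a lower weight than continuously felt ones.
    weights = {"logon_time": 1, "app_launch_time": 2,
               "screen_refresh_latency": 4, "bandwidth_availability": 3}

    def composite_score(ratings, weights):
        # Weighted average of per-metric ratings -> one user-experience percentage.
        total_weight = sum(weights[m] for m in ratings)
        return sum(ratings[m] * weights[m] for m in ratings) / total_weight

    print(f"User experience: {composite_score(ratings, weights):.1f}%")  # User experience: 76.5%

The single percentage gives an executive a quick answer to “how is the user experience?”, while operations staff can still drill into the individual metrics behind it.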

  • Situation-aware dynamic baseline alerting: Manually adjusting alert thresholds for every performance metric is challenging. Based on usage trends, different alert thresholds are needed at different times of the day and for each day of the week. An admin would not want the same threshold condition to trigger an alert during the day, when there is a high workload on an application server, as during low-workload periods at night or over the weekend. The best practice for determining these alert thresholds is to baseline application and infrastructure performance. Some APM tools use artificial intelligence and machine learning to auto-baseline the infrastructure and dynamically determine alert thresholds. This is critical: unless situational awareness is built into the APM solution, administrators will be flooded with false positives, making their job more difficult. A small baselining sketch follows the figure below.

Baseline Alert
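
To illustrate time-aware baselining, here is a minimal sketch that learns a separate threshold for each (day-of-week, hour) slot from historical samples and alerts only when a new value exceeds the threshold for that slot. The sample data and the simple three-sigma band are assumptions made for illustration; commercial tools typically rely on more sophisticated machine-learning models.

    from collections import defaultdict
    from statistics import mean, stdev

    # Hypothetical history of (weekday 0=Mon, hour, response_time_ms) samples.
    history = [
        (0, 10, 220), (0, 10, 240), (0, 10, 210), (0, 10, 230),  # busy Monday mornings
        (6, 3, 40), (6, 3, 55), (6, 3, 45), (6, 3, 50),          # quiet Sunday nights
    ]

    def build_baselines(samples):
        # Group samples by (weekday, hour) and derive a dynamic threshold per time slot.
        buckets = defaultdict(list)
        for weekday, hour, value in samples:
            buckets[(weekday, hour)].append(value)
        return {slot: mean(vals) + 3 * stdev(vals)
                for slot, vals in buckets.items() if len(vals) > 1}

    def is_anomalous(baselines, weekday, hour, value):
        # Alert only when the value exceeds the learned threshold for that slot.
        threshold = baselines.get((weekday, hour))
        return threshold is not None and value > threshold

    baselines = build_baselines(history)
    print(is_anomalous(baselines, 0, 10, 260))  # False: normal for a busy Monday morning
    print(is_anomalous(baselines, 6, 3, 260))   # True: far above the quiet weekend baseline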

eG Enterprise is an end-to-end application performance monitoring solution that includes all of these comprehensive, intelligent alerting capabilities, helping IT and business stakeholders get actionable insights for effective troubleshooting and decision-making. With out-of-the-box monitoring support for more than 180 applications (Java, SAP, SharePoint, Citrix, PeopleSoft, etc.), eG Enterprise tracks the health, availability and performance of all aspects of your business-critical applications and helps with proactive problem diagnosis and root cause analysis.

Learn more about APM with eG Enterprise »

 


More Stories By Vinod Mohan

Vinod Mohan is a Senior Product Marketing Manager for eG Innovations, a global provider of unified performance monitoring and root-cause diagnosis solutions for virtual, physical and cloud IT infrastructures. He has 10 years of experience in product, technology and solution marketing of IT software and services spanning application performance management, network, systems, virtualization, storage, IT security and IT service management (ITSM).

Previously, he was a Senior Product Marketing Manager at SolarWinds for server and application monitoring software. Now a key team member at eG Innovations, he is a contributing author for the eG Innovations blog, "Application & Virtualization Performance Insights", as well as other trade publications including APMdigest, DABCC, Cyber Defense Magazine, IT Briefcase, Infosec Island, The Hacker News, IT Pro Portal, and the SolarWinds THWACK community.
