
Dynatrace makes life easy for OpenStack admins (EAP starting)

We’re thrilled to announce the Early Access Program for Dynatrace OpenStack integration! This blog post is the first in a two-part series that explores how Dynatrace supports the monitoring of OpenStack environments.

OpenStack has become quite popular in recent years. Organizations are increasingly opting to build public and private OpenStack cloud environments for their employees and customers. One reason for the rapid adoption of OpenStack is its vibrant user community, which has fueled OpenStack’s growth and spirit of innovation. By joining the OpenStack community you can contribute your ideas related to requirements definition as well as development. This gives you the power to actively shape the features of the next OpenStack release.

OpenStack is indeed powerful, but it’s also complex. As an OpenStack admin, you know perfectly well that there’s no such thing as a flawless OpenStack cloud deployment. Even more challenging is maintaining smooth operation once your OpenStack cloud is used in a production environment.

Troubleshooting performance issues

Whether you’re working with a public or a private cloud, as an OpenStack administrator you need to contend with a range of challenges. The components most likely to cause problems are:

  • OpenStack services
  • Supporting technologies like HAProxy, RabbitMQ, and MySQL
  • Network

OpenStack troubleshooting can be complex and time-consuming because many OpenStack issues are elusive: problems with one OpenStack service can manifest themselves as performance issues within other services. For example, when a user reports an issue with launching a new VM or attaching a Cinder volume, your first thought might be to look into the log files of your Nova and Cinder services. After combing through hundreds of megabytes of log data, however, you might learn that the root cause of the issue resides within a different OpenStack service or supporting technology (for example, HAProxy, RabbitMQ, or MySQL).

Dynatrace has good news for you OpenStack admins out there. With Dynatrace OpenStack monitoring, you no longer need to spend hours troubleshooting elusive issues within your OpenStack cloud!

Dynatrace provides complete OpenStack monitoring

In contrast to conventional monitoring tools, which typically cover only a single monitoring domain, Dynatrace provides a complete monitoring solution. Dynatrace monitoring covers:

  • OpenStack services
  • Supporting technologies
  • Compute nodes and VMs
  • Log analysis

For each of these components, Dynatrace provides automated root-cause analysis to help you identify the sources of problems and resolve issues in a timely manner.

Analyze OpenStack performance

OpenStack pages provide a holistic overview of your entire OpenStack account (see example images below).

(1) See if key components like compute and controller nodes are healthy.

(2) Gain insight into environment dynamics by tracking how the number of running virtual machines evolves over time. An increasing trend may indicate the need for capacity adjustments. Crucial details regarding the number of VMs that have been spawned and their average launch times are also included. If you notice launch times going up, you may want to investigate the reasons why.

(3) The Events section provides details such as the compute node on which each VM was launched or stopped.

(4) The Compute section shows you how well your compute nodes are performing, which virtual machines are currently running on those nodes, and how the VMs contribute to overall resource usage.
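The trend check described in item (2) can be approximated with a simple least-squares slope. The sketch below is illustrative only: the sample values and the function name `slope_per_sample` are hypothetical, and it makes no claim about how Dynatrace itself evaluates trends.

```python
from statistics import mean


def slope_per_sample(values):
    """Least-squares slope of values plotted against their index (0, 1, 2, ...)."""
    if len(values) < 2:
        raise ValueError("need at least two samples")
    xs = range(len(values))
    x_mean = mean(xs)
    y_mean = mean(values)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical hourly samples, not real measurements.
running_vms = [40, 42, 45, 47, 52, 55]
launch_times_s = [11.0, 11.5, 12.0, 14.0, 17.5, 21.0]

vm_slope = slope_per_sample(running_vms)
launch_slope = slope_per_sample(launch_times_s)

if vm_slope > 0:
    print("Running VM count is trending up; review capacity.")
if launch_slope > 0:
    print("Average launch time is trending up; investigate why.")
```

A positive slope on either series is only a prompt to look closer; a real check would also need a minimum window length and some tolerance for noise.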

You can slice and dice your OpenStack monitoring data with filters: compute nodes and virtual machines can be filtered based on Region, Security group name, Compute node name, Availability zone, and more. Such filtering is particularly useful for tracking down elusive performance issues within large environments.
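To make the idea of attribute-based filtering concrete, here is a minimal sketch over an in-memory inventory. The `VirtualMachine` record, the `filter_vms` helper, and all sample values are hypothetical; they only mirror the filter names mentioned above and are not a Dynatrace API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    name: str
    region: str
    availability_zone: str
    compute_node: str
    security_groups: tuple


def filter_vms(vms, **criteria):
    """Return the VMs that match every given criterion.

    A `security_group` criterion matches when the VM belongs to that group;
    every other criterion is compared for equality with the attribute.
    """
    def matches(vm):
        for key, wanted in criteria.items():
            if key == "security_group":
                if wanted not in vm.security_groups:
                    return False
            elif getattr(vm, key) != wanted:
                return False
        return True

    return [vm for vm in vms if matches(vm)]


# Hypothetical inventory.
inventory = [
    VirtualMachine("web-1", "RegionOne", "nova", "compute-01", ("default", "web")),
    VirtualMachine("web-2", "RegionOne", "nova", "compute-02", ("default", "web")),
    VirtualMachine("db-1", "RegionTwo", "az-2", "compute-07", ("default", "db")),
]

web_in_region_one = filter_vms(inventory, region="RegionOne", security_group="web")
on_compute_07 = filter_vms(inventory, compute_node="compute-07")
```

Combining criteria narrows the result set, which is the same effect that stacking filters has in a large environment.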

Smartscape analysis (see below) shows you how your VMs interact with one another and gives you an understanding of the vertical dependencies between your application components—virtual machines, processes, and services.

Performance analysis of OpenStack services

Let’s explore Dynatrace’s automated problem detection and root-cause analysis capabilities with a Keystone use case. In the example below, the Keystone service began to respond slowly to TCP requests due to memory saturation on one of the controller nodes. Dynatrace has automatically identified the underlying root cause of this issue and the impact of the problem.

Let’s drill down into the Keystone metrics to better understand what’s going on here. Click the Keystone process tile to analyze this process within the context of the detected performance problem.

Here on the Keystone process page we see that the response time of the Keystone service has increased significantly, from 200 ms to 2 s.

By clicking the View all log entries button, you can explore all of the log data that’s been generated by this process.

The Log viewer has uncovered numerous warnings within the Keystone.log file indicating that the authentication process has been failing.

Now let’s take a look at the controller node that caused the issue. As you can see below, memory was indeed exhausted; it reached almost 100% saturation.

Note further down in the Processes section that all OpenStack services running on the controller are listed. Click any of these individual processes to analyze their connections and understand their relationship to other processes.

Dynatrace reports an outage event when Keystone becomes completely unavailable (see below). Outages are a major concern because they prevent users from performing any operations (each API request requires a Keystone token).
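The dependency on Keystone is easy to see in the request flow: a client first asks the Keystone Identity API v3 for a token (`POST /v3/auth/tokens`), reads it from the `X-Subject-Token` response header, and then sends it as `X-Auth-Token` on every subsequent call. The sketch below only builds the request body and headers and sends nothing; the user name, password, project, and token value are placeholders, and the helper names are made up for illustration.

```python
import json


def build_token_request(username, password, project, domain_id="default"):
    """Build the JSON body for POST /v3/auth/tokens (password auth, project scope)."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"id": domain_id},
                        "password": password,
                    }
                },
            },
            "scope": {
                "project": {"name": project, "domain": {"id": domain_id}}
            },
        }
    }


def authenticated_headers(token):
    """Headers carried by later OpenStack API calls once a token is issued."""
    return {"X-Auth-Token": token, "Content-Type": "application/json"}


# Placeholder credentials and token, for illustration only.
body = build_token_request("demo-user", "demo-password", "demo-project")
payload = json.dumps(body)
headers = authenticated_headers("placeholder-token")
```

If Keystone cannot issue or validate the token, none of the later calls can be authorized, which is why a Keystone outage affects every service.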

Out of the box, Dynatrace automatically monitors your OpenStack environment for a wide range of potential log-based problem patterns. Dynatrace additionally detects when an OpenStack service can’t connect to a database or fails to authenticate.

Monitoring supporting technologies

Another potential problem area that OpenStack admins need to keep an eye on is the set of technologies that are frequently deployed alongside OpenStack. These include load balancers (e.g., HAProxy), message brokers (e.g., RabbitMQ), and databases (e.g., MySQL).

To illustrate the challenges involved in monitoring the technologies that support OpenStack, here’s a problem we ran into within our own OpenStack environment. The RabbitMQ process (see below) was launched using the default file descriptor limit of 1024. When this limit was reached, RabbitMQ stopped accepting new connections. This issue resulted in a Connectivity problem (see below).

We wouldn’t have known about this problem if it weren’t for the RabbitMQ-specific counters that Dynatrace provides. All of this detail is included in the same view, so you don’t need to use multiple tools to get the full picture.
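The file descriptor problem above comes down to a ratio: descriptors in use divided by the limit. The following sketch shows that arithmetic, plus how a Unix process can read its own soft limit with Python's standard `resource` module. The 80% threshold, the function names, and the sample counts are assumptions for illustration, and the code does not inspect RabbitMQ itself.

```python
import resource


def fd_saturation(open_fds, soft_limit):
    """Fraction of the file descriptor limit that is currently in use."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    return open_fds / soft_limit


def needs_attention(open_fds, soft_limit, threshold=0.8):
    """True when usage is at or above the (arbitrary) warning threshold."""
    return fd_saturation(open_fds, soft_limit) >= threshold


def current_soft_limit():
    """Soft RLIMIT_NOFILE of the calling process (Unix only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


# Hypothetical readings against the default limit of 1024 mentioned above.
for open_fds in (200, 850, 1024):
    ratio = fd_saturation(open_fds, 1024)
    flag = "WARN" if needs_attention(open_fds, 1024) else "ok"
    print(f"{open_fds}/1024 descriptors in use ({ratio:.0%}) {flag}")
```

Warning before the limit is reached leaves time to raise it, rather than finding out when new connections are already being refused.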

OpenStack dashboard tiles

Dynatrace provides two different OpenStack tiles that you can add to your home dashboard. The Regions tile displays relevant statistics related to the health of OpenStack services such as Keystone, Nova, compute nodes, virtual machines, and more. The Project tile provides insights into resource usage, taking assigned quotas into consideration. This information enables you to think proactively about resource usage issues related to critical projects, providing you with early warning of any resource capacity issues that may present themselves.
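The kind of early warning the Project tile aims at can be illustrated by comparing usage with assigned quotas. This is a minimal sketch under stated assumptions: the project names, numbers, 90% threshold, and helper names are hypothetical, and a negative quota is treated as "unlimited" by assumption.

```python
def quota_utilization(used, quota):
    """Fraction of the quota in use, or None when the quota is unlimited.

    Assumption: a negative quota means "unlimited". A quota of zero is
    reported as fully used.
    """
    if quota < 0:
        return None
    if quota == 0:
        return 1.0
    return used / quota


def projects_near_quota(usage, threshold=0.9):
    """List (project, resource, utilization) entries at or above the threshold.

    `usage` maps project name -> resource name -> (used, quota).
    """
    findings = []
    for project, resources in usage.items():
        for resource_name, (used, quota) in resources.items():
            utilization = quota_utilization(used, quota)
            if utilization is not None and utilization >= threshold:
                findings.append((project, resource_name, utilization))
    return findings


# Hypothetical usage figures.
usage = {
    "web-shop": {"instances": (18, 20), "cores": (40, 80)},
    "batch": {"instances": (3, 10), "cores": (12, -1)},
}

near_limit = projects_near_quota(usage)
```

Here only the `instances` quota of the hypothetical `web-shop` project is flagged, since 18 of 20 instances puts it at the 90% threshold.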

To add an OpenStack tile to your home dashboard

  1. Click the Home dashboard button in the upper-left corner.
  2. Click the Browse (…) button in the upper-right corner.
  3. Click Add tile.
  4. Select the Infrastructure filter in the left-hand navigation menu.
  5. Select the All regions tile or the Project tile.

Stay tuned for part two of this blog post series, to be published shortly. Part two will cover full-stack monitoring of applications that run in OpenStack clouds.

The post Dynatrace makes life easy for OpenStack admins (EAP starting) appeared first on Dynatrace blog – monitoring redefined.


More Stories By Dynatrace Blog

Building a revolutionary approach to software performance monitoring takes an extraordinary team. With decades of combined experience and an impressive history of disruptive innovation, that’s exactly what ruxit has.

Get to know ruxit, and get to know the future of data analytics.
