|By Andreas Grabner||
|October 5, 2014 07:00 PM EDT||
In my role as technology evangelist I spend a lot of time helping organizations, big and small, make their IT systems better, faster and more resilient to faults in order to support their business operations and objectives. I always find it frustrating to "argue" with our competitors about what the best solution is. I honestly think that many APM tools on the market do a good job - each with advantages and disadvantages in certain use cases. There is no "one size fits all" - there is just a "this tool fits best for your APM Maturity Level" (not saying the others wouldn't do a good job).
A lot of the arguing in the APM space is about the fundamental approach to monitoring application transactions: monitor and capture ALL details vs. monitor and capture relevant details. Along with that come topics like "overhead impact", "scalability" and "data hording vs smart analytics".
Ultimately, you want to pick the right tool to solve your problems. As you have multiple tools to choose from let me - in my role as technology evangelist - highlight some of the use cases that our customers solve. As a technologist and a blogger, what I really care about is that the right technology is applied to the right problem. As such, I feel compelled to share what I have learned working with customers in the trenches. Hopefully, this will help you understand the technology and what problem it can solve in real life problems, and cut through the propaganda. Let me start with a few use cases today and follow up with some more in follow up blog posts.
Use Cases from Steven - A Performance Engineer
The first use cases are picked from Steven - whom I reached out to after I read his question on our APM Community Forum. His company decided to move from a competitor to our APM solution and I wondered why. In an email, he highlighted that he had some initial success with the tool, and had been able to solve a couple of low hanging problems. When they decided to start taking a strategic Continuous Delivery approach to software delivery, they realized that the current tool had certain shortcomings slowing their attempts to practice DevOps.
They identified the following key problems they need to solve and what they really required from an APM solution in order to get to where they are heading:
How a user got to a problem, and not just seeing the problem itself
- Every transaction, with all details they need, out-of-the-box
- Web request/response bytes, SQL bind values, exception details for every transaction
Number of transactions executed per user and tenant used for business and cost reporting
- Capture custom business context data for every transaction
- Business transactions based on "buried" context data as not every detail is in the URL
Eliminate homegrown tools which are costly to maintain
- Provide application as well as system and infrastructure monitoring
- Integrate with other tools such as JMeter, LoadRunner, Jenkins or HP Open View
Eliminate the need to make people look at other tools and data
- Foster collaboration across Architect, Dev, Test & Ops by using same data set
- Data must be shareable with a single click
Ability to extend to custom frameworks, systems and protocols
- Bring in custom metrics from external tools via Java Plugin infrastructure
- Follow transactions across any custom protocol or technologies outside Java & .NET
Full Automation to support Continuous Delivery
- Use Metrics provided by APM for every build artifact along the deployment pipeline to act as quality gateway
- Inform APM about new deployments to prevent false alerting
Replace traditional application logging
- Eliminated log files which saves I/O and storage
- Get the log messages captured in context of a transaction and the context of the user that triggered that log message
One solution for everything
- Not just performance monitoring but also business reporting as well as deep dive diagnostics
Active community forum
- Get answers right away
- Leverage extensions already provided by the community such as plugins for Jenkins, PagerDuty, ...
Let me give you some examples for Steven's use case so that you can better decide on whether that is relevant for you as well:
Every Transaction with All Details
dynaTrace was built from the ground up to support the full software lifecycle. We as Compuware APM/dynaTrace understood that we needed a technology that captures every transaction with all details for root cause diagnostics as well as proper business monitoring without falling into a sampling mode where you lose critical information for both business and root cause diagnostics. Most of our customers claim they see little to acceptable overhead in production yet capturing 100% transactions including method arguments, SQL Statements, Log Messages or Exceptions. The magic word in our case is our PurePath (see the YouTube video) & PureStack Technology which allows dynaTrace to do exactly that. One of the several visualization of the PurePath is the Transaction Flow which is a great way to understand how your transactions flow through the system - where your hotspots are (3rd party impact, custom code issues or impact of Garbage Collection) and where your architectural issues (e.g: too many web service calls, too many SQL executions):
Transaction Flow: One View that tells it all to Devs, Architects and Operations Teams
What if you don't capture all transactions but be "smart" and focus on capturing the problematic ones? While this approach allows you to find and fix the easy-to-find problems that can be analyzed by analyzing those transactions that fail or violate the average response-time based baseline, it falls short when it comes to problems that are caused by transactions that are not "outside the norm". One example here is a database deadlock we recently analyzed for a customer. The "smart" approach only highlighted the transaction that hit the deadlock but no information was captured for those transactions actually causing the deadlock with their data manipulations. Being able to see which transactions executed which UPDATE statements at the time leading up to the deadlock is required to solve this problem.
As companies - such as Steven's - are getting into a maturity level where they grow out of "smart" average response time-based analysis it is important to have the ability to look at everything and not just the average problem. As a follow up read the blog Why Averages Suck and Percentiles are great!
Capture Custom Business Context
What is Custom Business Context? The actual business function executed such as a "Create Claim", "Transfer Money," or the name of the user or tenant of your system. Why is this not as easy as it sounds? Because many applications just don't show the business function as part of the URL or provide the user name in a cookie. A great example was given in a webinar by NJM Insurance (New Jersey Manufacturing Insurance). They were using a third-party claim management software which was designed to "hide" everything behind a claimCenter.do URL. In their case they needed dynaTrace to analyze every single transaction and pick a method argument invoked in the business layer of their app to figure out which function in their system was actually executed. On top of that they also needed to know the user that executed that function because they needed to understand which insurance office and group of employees created how many claims as they needed this for their quarterly business reports. The following shows business reporting based on the user role where the user role gets captured from a method argument within the business logic of the application:
Business Reporting requires Business Context data for every Transaction
This was only possible because dynaTrace allows you to selectively capture business context in the context of every single executed transaction. Along the PurePath you will then see things like method arguments, return values, bind values, session variables, HTTP parameters or cookie values. All to be later used for your business reporting or targeted root cause diagnostics. Here is a follow up blog post that explains business transactions in more technical detail.
Tricky charts and visually deceptive graphs often make a case for the impact IT performance has on business. The debate isn't around the obvious; of course, IT performance metrics like website load time influence business metrics such as conversions and revenue. Rather, this presentation will explore various data analysis concepts to understand how, and how not to, assert such correlations. In his session at 20th Cloud Expo, Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Sys...
Feb. 26, 2017 09:45 AM EST Reads: 1,923
In recent years, containers have taken the world by storm. Companies of all sizes and industries have realized the massive benefits of containers, such as unprecedented mobility, higher hardware utilization, and increased flexibility and agility; however, many containers today are non-persistent. Containers without persistence miss out on many benefits, and in many cases simply pass the responsibility of persistence onto other infrastructure, adding additional complexity.
Feb. 26, 2017 09:30 AM EST Reads: 1,423
It is one thing to build single industrial IoT applications, but what will it take to build the Smart Cities and truly society changing applications of the future? The technology won’t be the problem, it will be the number of parties that need to work together and be aligned in their motivation to succeed. In his Day 2 Keynote at @ThingsExpo, Henrik Kenani Dahlgren, Portfolio Marketing Manager at Ericsson, discussed how to plan to cooperate, partner, and form lasting all-star teams to change the...
Feb. 26, 2017 09:15 AM EST Reads: 5,398
SYS-CON Events announced today that Hitrons Solutions will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Hitrons Solutions Inc. is distributor in the North American market for unique products and services of small and medium-size businesses, including cloud services and solutions, SEO marketing platforms, and mobile applications.
Feb. 26, 2017 09:15 AM EST Reads: 315
Feb. 26, 2017 09:15 AM EST Reads: 1,495
Unsecured IoT devices were used to launch crippling DDOS attacks in October 2016, targeting services such as Twitter, Spotify, and GitHub. Subsequent testimony to Congress about potential attacks on office buildings, schools, and hospitals raised the possibility for the IoT to harm and even kill people. What should be done? Does the government need to intervene? This panel at @ThingExpo New York brings together leading IoT and security experts to discuss this very serious topic.
Feb. 26, 2017 09:15 AM EST Reads: 2,973
When it comes to cloud computing, the ability to turn massive amounts of compute cores on and off on demand sounds attractive to IT staff, who need to manage peaks and valleys in user activity. With cloud bursting, the majority of the data can stay on premises while tapping into compute from public cloud providers, reducing risk and minimizing need to move large files. In his session at 18th Cloud Expo, Scott Jeschonek, Director of Product Management at Avere Systems, discussed the IT and busine...
Feb. 26, 2017 09:00 AM EST Reads: 6,075
All clouds are not equal. To succeed in a DevOps context, organizations should plan to develop/deploy apps across a choice of on-premise and public clouds simultaneously depending on the business needs. This is where the concept of the Lean Cloud comes in - resting on the idea that you often need to relocate your app modules over their life cycles for both innovation and operational efficiency in the cloud. In his session at @DevOpsSummit at19th Cloud Expo, Valentin (Val) Bercovici, CTO of Soli...
Feb. 26, 2017 08:45 AM EST Reads: 1,307
“We're a global managed hosting provider. Our core customer set is a U.S.-based customer that is looking to go global,” explained Adam Rogers, Managing Director at ANEXIA, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Feb. 26, 2017 08:45 AM EST Reads: 2,203
Feb. 26, 2017 08:30 AM EST Reads: 8,125
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In his general session at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, will explore...
Feb. 26, 2017 08:30 AM EST Reads: 1,617
WebRTC services have already permeated corporate communications in the form of videoconferencing solutions. However, WebRTC has the potential of going beyond and catalyzing a new class of services providing more than calls with capabilities such as mass-scale real-time media broadcasting, enriched and augmented video, person-to-machine and machine-to-machine communications. In his session at @ThingsExpo, Luis Lopez, CEO of Kurento, introduced the technologies required for implementing these idea...
Feb. 26, 2017 08:00 AM EST Reads: 6,829
In an era of historic innovation fueled by unprecedented access to data and technology, the low cost and risk of entering new markets has leveled the playing field for business. Today, any ambitious innovator can easily introduce a new application or product that can reinvent business models and transform the client experience. In their Day 2 Keynote at 19th Cloud Expo, Mercer Rowe, IBM Vice President of Strategic Alliances, and Raejeanne Skillern, Intel Vice President of Data Center Group and G...
Feb. 26, 2017 06:00 AM EST Reads: 4,354
Financial Technology has become a topic of intense interest throughout the cloud developer and enterprise IT communities. Accordingly, attendees at the upcoming 20th Cloud Expo at the Javits Center in New York, June 6-8, 2017, will find fresh new content in a new track called FinTech.
Feb. 26, 2017 06:00 AM EST Reads: 6,330
@GonzalezCarmen has been ranked the Number One Influencer and @ThingsExpo has been named the Number One Brand in the “M2M 2016: Top 100 Influencers and Brands” by Onalytica. Onalytica analyzed tweets over the last 6 months mentioning the keywords M2M OR “Machine to Machine.” They then identified the top 100 most influential brands and individuals leading the discussion on Twitter.
Feb. 26, 2017 06:00 AM EST Reads: 5,831