Accedian is now part of Cisco

Troubleshooting Infrastructure and Application Performance with Accelerated MTTR/MTTI

IT operations professionals face a number of challenges when diagnosing and troubleshooting infrastructure and application performance issues, particularly in today’s complex IT environments, which combine virtualization, cloud (public, private, hybrid) environments, and “as-a-Service” platforms (e.g., SaaS, IaaS, PaaS), to say nothing of the increasing prevalence of “shadow IT”. With the clock ticking to achieve mean time to resolution (MTTR) objectives, a second dynamic, finger-pointing, is often at play. The mean time to innocence (MTTI), the time a team needs to show that it is not the source of the problem, is therefore another practical, if unofficial, concern for any IT operations professional seeking to resolve issues quickly.

In their efforts to troubleshoot application performance degradations and to restore application performance in line with MTTR objectives, IT operations professionals such as the IT infrastructure manager, the DevOps manager, or the data center manager need to:

  • Detect application performance degradations as soon as they occur, preferably even before the end user experience (EuE) is impacted and users report it
  • Measure the impact of application performance degradations, qualifying the nature, scope, and timing of the degradation, while quantifying its impact on business-critical activities
  • Identify the root cause of a performance degradation by pinpointing the specific application host, database server, web service, or device that is affected, as well as the reasons for its occurrence
  • Attribute ownership of the performance degradation within the IT operations team or identify the appropriate owner in the Network team when it is not the fault of the IT infrastructure (thus establishing innocence for the IT operations team)
  • Resolve the application performance degradation by prioritizing its resolution, assigning responsibility, and sharing the necessary evidence such as application conversations immediately prior to the degradation
  • Validate that application performance levels have been restored according to internal key performance indicators (KPIs) such as availability or end user response time (EURT), service level agreement (SLA) targets in the case of SaaS platforms, or general historical performance baselines

Effective application performance troubleshooting and resolution

The IT operations professional must act quickly to restore application performance to expected levels while meeting MTTR objectives. To do so requires an application performance monitoring and management tool that enables IT operations professionals to:

  • Become aware of application performance degradations as soon as they occur, ideally before the point that they noticeably impact the end user experience (EuE)
  • Visualize the impact of performance degradations across all application chains in even the most complex network and IT environments, including hybrid clouds and Software-as-a-Service (SaaS) solutions
  • Rapidly pinpoint the specific host, database, service, or device that is affected by a performance degradation in order to move toward resolution
  • Identify the reasons for the performance degradation, with the ability to look back to the period before it occurred to understand normal levels of application performance, as well as its impact on business-critical operations
  • Attribute ownership of the performance degradation and quickly share the supporting evidence with the appropriate IT operations or network team to achieve MTTR targets (as well as MTTI ones)
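To make the two metrics above concrete, here is a minimal sketch of how MTTR and MTTI might be computed from incident records. The `Incident` record and its field names are hypothetical, and because MTTI is an informal metric, the definition used here (time from detection until the IT operations team is shown not to be at fault) is an assumption rather than a standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    detected: datetime                   # degradation first detected
    resolved: datetime                   # performance confirmed restored
    cleared: Optional[datetime] = None   # team shown not to be at fault, if ever


def mean_duration(durations: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    if not durations:
        raise ValueError("no durations to average")
    return sum(durations, timedelta()) / len(durations)


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time from detection to resolution."""
    return mean_duration([i.resolved - i.detected for i in incidents])


def mtti(incidents: list[Incident]) -> timedelta:
    """Mean time from detection to being cleared (assumed definition).

    Incidents where the team was never cleared are left out of the mean.
    """
    return mean_duration(
        [i.cleared - i.detected for i in incidents if i.cleared is not None]
    )
```

Comparing the two figures over the same set of incidents gives a rough sense of how much of the resolution time is spent establishing ownership rather than fixing the problem.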

The well-designed application performance management (APM) solution

A well-designed application performance management (APM) solution supports application performance troubleshooting and resolution best practices with the ability to visualize and measure all aspects of application performance while providing the answers necessary to bring a rapid resolution to any performance degradation on any application chain. The well-designed APM solution will offer:

  • Proactive alerting of application performance degradations based on pre-determined performance thresholds (KPIs) or SLA targets, enabling early identification
  • The ability to view the performance of the entire application ecosystem across sites, data centers, the open internet, and cloud environments (public, private, hybrid) from a single pane of glass
  • The ability to drill down, with as few mouse clicks as possible, to the specific host, database, or service that is affected, and to inspect individual application conversations, including HTTPS connections, VoIP calls, SQL queries, or CIFS/SMB file shares
  • The ability to share evidence of the application performance degradation, including in the moments leading up to it, and to compare the performance of the affected host, database, or service over time with historical benchmarks
  • The ability to confirm that an application performance degradation has been resolved by demonstrating its return to historical or target baselines, enabling rapid closure of the trouble ticket
  • Visibility into “shadow IT” and its impact on overall application and network performance within the organization
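The threshold-based alerting and resolution checks listed above can be sketched as follows. This is an illustration under stated assumptions, not how any particular APM product works: the `Sample` record, the function names, and the rule of requiring several consecutive samples are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical measurement; field names are illustrative only."""
    minute: int      # minutes since the start of the observation window
    eurt_ms: float   # end user response time (EURT) in milliseconds


def first_breach(samples, threshold_ms, consecutive=3):
    """Return the minute at which the threshold has been exceeded for
    `consecutive` samples in a row, or None if that never happens."""
    run = 0
    for s in samples:
        run = run + 1 if s.eurt_ms > threshold_ms else 0
        if run == consecutive:
            return s.minute
    return None


def is_restored(samples, threshold_ms, consecutive=3):
    """True if the last `consecutive` samples are all at or below the threshold."""
    tail = samples[-consecutive:]
    return len(tail) == consecutive and all(
        s.eurt_ms <= threshold_ms for s in tail
    )
```

Requiring several consecutive samples on both sides is one way to avoid alerting on a single slow request, or closing a ticket on a single fast one. The right window length depends on the sampling interval and on how strict the KPI or SLA target is.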

The link between network performance and application performance troubleshooting

An application performance management (APM) solution that includes network performance management (NPM) capabilities, that is, an integrated network and application performance management (NAPM) solution, provides IT operations professionals with insight that a dedicated APM solution alone can’t deliver. Because it can identify and interpret individual application conversations as well as network flows, an integrated NAPM solution provides context that goes far beyond the typical reports and graphs of an APM product or a dedicated application performance troubleshooting tool. It also shows how a performance degradation at the network level affects the organization’s ability to deliver on critical business outcomes.

An integrated NAPM solution will:

  • Provide a wide-angle view of the entire network and IT infrastructure, including physical, virtualized, and software-defined networks (SDN), as well as application chains, cloud (public, private, hybrid) environments, and “as-a-Service” platforms (SaaS, PaaS, IaaS)
  • Capture and measure the performance of all application and network traffic between hosts, sites, and data centers (“north-south” traffic) as well as traffic within data centers and clouds (“east-west” traffic)
  • Enable the establishment of baselines for normal performance of individual applications and networks, as well as peaks and seasonality
  • Provide the ability to pinpoint the root cause of a performance degradation by drilling down to the specific component of the application chain (application server, database, device), or the network, and viewing the individual network flows and application transactions
  • Allow for sharing evidence of the circumstances leading up to an application performance degradation and of its effects afterwards, including supporting application conversations and network flows
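As an illustration of the baselining point above, the sketch below builds a simple per-hour baseline so that a value that is normal at a daily peak is not judged against overnight traffic. It is a deliberately simplified model: the function names, the hour-of-day grouping, and the "k standard deviations above the mean" rule are assumptions made for the example, not a description of any product's method.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hourly_baseline(history):
    """Build a baseline from (hour_of_day, value) pairs.

    Returns {hour: (mean, population standard deviation)}.
    """
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    return {h: (mean(v), pstdev(v)) for h, v in by_hour.items()}


def is_anomalous(baseline, hour, value, k=3.0):
    """True if `value` is more than k standard deviations above the
    baseline mean for that hour of the day."""
    if hour not in baseline:
        return False  # no history for this hour, so no judgement is made
    mu, sigma = baseline[hour]
    return value > mu + k * sigma
```

With this grouping, a response time of 120 ms can be unremarkable at 09:00 yet clearly abnormal at 03:00. A production baseline would typically also account for day-of-week and longer seasonal patterns.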

As a result, an integrated NAPM solution is a particularly effective tool for diagnosing application performance degradations, identifying their source while providing evidence to support resolution efforts, and reducing MTTR. Skylight is one such example of an integrated NAPM solution for enterprise application performance troubleshooting and resolution.
