Troubleshooting infrastructure performance and accelerating Mean Time To Resolution (MTTR) with Skylight

Troubleshooting IT infrastructure performance degradations and accelerating MTTR are particularly challenging in today’s complex networking and IT environments, where physical and virtualized servers and networks are spread across remote sites, multiple data centers, cloud environments, and “as-a-Service” platforms.

Effectively troubleshooting infrastructure performance degradations and achieving MTTR objectives requires:

  • Detecting infrastructure performance degradations in real time
  • Quantifying their impact
  • Identifying the root cause
  • Attributing ownership
  • Sharing evidence of the impact and source
  • Resolving performance degradations
  • Validating that performance levels have been restored

Skylight and infrastructure performance troubleshooting

A network performance management (NPM) solution with integrated application performance management (APM) capabilities, Skylight enables IT Operations professionals to rapidly detect all types of infrastructure performance degradations, quantify their impact with a few mouse clicks, identify their root causes efficiently, and route them to the appropriate team for immediate resolution with supporting evidence.

Skylight:

  • Measures the performance of the entire IT infrastructure with zero impact, using wire data metrics
  • Detects infrastructure performance degradations in real time, before users report them
  • Quantifies the impact of infrastructure performance degradations in comparison with historical baselines
  • Determines the affected links, sites, hosts, users, devices, and services
  • Identifies the root cause, even in complex IT environments
  • Generates evidence in support of resolution efforts
  • Validates that performance has been restored to previous levels

Skylight provides a wide-angle view of the entire IT infrastructure with universal network and application coverage, including north-south traffic between remote sites and data centers over physical, virtualized and software-defined networks (SDNs), and east-west traffic inside data centers and cloud environments, enabling IT Operations professionals to detect and isolate infrastructure performance degradations wherever they occur.

Instead of requiring analysts to work through an infrastructure troubleshooting checklist, Skylight simplifies infrastructure troubleshooting with real-time dashboards and drill-downs that take at most four clicks. Skylight sensors provide complete IT infrastructure visibility and analytics, with full-featured performance monitoring and analysis tools in an intuitive at-a-glance user interface.

Whether troubleshooting network latency, diagnosing network response time issues, or investigating application slowness in depth, Skylight is the network and application performance management (NAPM) solution that accelerates MTTR for the most complex performance degradations in the most demanding networking and IT environments.

Troubleshooting infrastructure performance degradations effectively

Measuring infrastructure performance with zero impact

  • Wide-angle view of the entire IT infrastructure with universal network and application coverage, including cloud environments (public, private, hybrid) and “as-a-Service” platforms (SaaS, IaaS, PaaS)
  • Wire data metrics with objective measurements of 100% of network and application traffic, at over 10 Gbps per capture appliance with support for 40 Gbps links
    • Physical, virtualized, and SDNs, including SD-WAN
    • Network services (DNS, NetBIOS, NTP, etc.)
    • VoIP (SIP, Skinny, MGCP) and Unified Communications (UC)
    • Citrix® sessions (XenApp and XenDesktop), HTTP/S sessions (including public and private SaaS applications), SQL database transactions, and CIFS/SMB file transfers
  • Support for flow-based analytics (e.g., NetFlow, sFlow, J-Flow, IPFIX) and packet-based analytics
  • Performance data stored for up to 365 days, enabling historical comparisons for establishing performance baselines and evaluating levels of performance degradations
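As an illustration of how stored history can serve as a baseline, the sketch below summarizes past response times as a median and a 95th percentile, then expresses a current reading as a multiple of the median. It is a generic example, not Skylight's method: the function names, the sample values, and the choice of statistics are all assumptions made for illustration.

```python
from statistics import median, quantiles

def build_baseline(samples_ms):
    """Summarize historical response times (ms) as a median and a 95th percentile."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to build a baseline")
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(samples_ms, n=20)[-1]
    return {"median_ms": median(samples_ms), "p95_ms": p95}

def degradation_ratio(current_ms, baseline):
    """How many times slower the current value is than the baseline median."""
    return current_ms / baseline["median_ms"]

# Hypothetical historical response times for one application, in milliseconds.
history = [100, 110, 90, 105, 95, 100, 120, 98, 102, 100]
baseline = build_baseline(history)
print(baseline["median_ms"], degradation_ratio(250, baseline))
```

A longer retention window gives the baseline more samples to draw on, which is what makes comparisons against seasonal or weekly patterns possible.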

Detecting infrastructure performance degradations in real time

  • Monitor performance across all network types and application chains from a single pane of glass
  • Thresholds for proactive alerting on degradations of business-critical applications (BCAs) and business-critical networks (BCNs)
    • Internal key performance indicators (KPIs), such as end user response time (EURT), or third-party Service Level Agreement (SLA) commitments
  • Real-time alerting when a critical application provides a degraded end user experience (EuE)
  • Detect performance degradations based on objective measures of infrastructure performance before users report them
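Threshold-based alerting of the kind listed above can be sketched as follows. This minimal example flags a breach only when the end user response time stays above a KPI threshold for several consecutive samples, which avoids alerting on a single spike. The function name, the 400 ms threshold, and the three-sample rule are hypothetical and do not describe Skylight's actual alerting logic.

```python
def detect_breaches(eurt_ms, threshold_ms, min_consecutive=3):
    """Return the start index of each run in which EURT exceeds the
    threshold for at least `min_consecutive` samples in a row."""
    alerts = []
    run_start = None
    for i, value in enumerate(eurt_ms):
        if value > threshold_ms:
            if run_start is None:
                run_start = i
            # Record the run once, at the moment it becomes long enough.
            if i - run_start + 1 == min_consecutive:
                alerts.append(run_start)
        else:
            run_start = None
    return alerts

# Hypothetical EURT samples in milliseconds; one isolated spike, one sustained run.
samples = [180, 190, 420, 200, 450, 470, 510, 530, 210]
print(detect_breaches(samples, threshold_ms=400))
```

The isolated spike at index 2 is ignored, while the sustained run starting at index 4 is reported.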

Quantifying the impact of infrastructure performance degradations

  • View infrastructure performance relative to normal or historical levels (going back up to 365 days) or to patterns such as peaks and seasonality
  • Pinpoint the specific networks, sites, hosts, or users affected
  • Understand immediately when the problem occurred, and which components contributed to the degradation: client, network, server processing, or data volumes
  • Instant-playback dashboards summarize infrastructure performance data, enabling analysts to understand:
    • “What happened exactly? Who was involved? Where did it take place? When did it occur?” And most important, “Why did it happen?”
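To make the idea of contributing components concrete, the sketch below compares a per-component breakdown of response time against a baseline and reports which component grew the most. The component keys mirror the categories named above (client, network, server processing, data volumes), but the key names, the function, and the figures are assumptions for illustration only.

```python
# Hypothetical component names, mirroring the categories listed above.
COMPONENTS = ("client", "network", "server", "data_transfer")

def largest_contributor(baseline_ms, current_ms):
    """Return (component, delta_ms) for the component whose time grew the most."""
    deltas = {c: current_ms[c] - baseline_ms[c] for c in COMPONENTS}
    worst = max(deltas, key=deltas.get)
    return worst, deltas[worst]

# Hypothetical per-component times in milliseconds.
baseline = {"client": 20, "network": 40, "server": 120, "data_transfer": 60}
current = {"client": 22, "network": 45, "server": 610, "data_transfer": 65}
print(largest_contributor(baseline, current))
```

Here the server component accounts for nearly all of the increase, which would point the investigation at server processing, not the network.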

Identifying the root cause of an infrastructure performance degradation

  • Drill down with one-click access to the devices, applications, databases, or services experiencing degraded performance
  • Instantly identify the origin of the response time degradations and rapidly assign their resolution to the appropriate IT team, shortening MTTR for end users
    • Remote desktop/application access: VDI, Terminal Server, and Citrix layers
    • Client applications: generating disconnections, slowing down transfers, generating rogue traffic, etc.
    • Servers: generating disconnections, slowing down transfers
    • Applications with excessive processing times or application errors
    • Network devices: improper configuration (routing, prioritization), slow performance (degraded latency, packet loss), congestion
    • Common network services, such as name resolution, authentication, etc.

Sharing evidence of infrastructure performance degradations

  • Play back individual application conversations and network transactions to view the events leading up to the performance degradation
  • Correlate performance degradations with specific user action(s) or events, planned or ad hoc, to determine the cause of the degradation
    • e.g., client-side upgrades, network device configuration changes, common services deployments, security policy updates, server configuration changes, or application upgrades
  • Generate and export reports with the exact network requests or application conversations that led to the performance degradation

Validating that infrastructure performance levels have been restored

  • Monitor the performance path affecting end user experience and get instant feedback on the impact of the steps taken by the relevant IT team to mitigate performance issues
  • Compare infrastructure performance with historical baselines to validate the effectiveness of the solution, close the case, and communicate efficiently
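A validation check against a historical baseline might look like the sketch below, which treats performance as restored when the median response time after a fix is no more than a tolerance above the baseline median. The function name, the 10% tolerance, and the sample values are hypothetical; an actual acceptance criterion would come from the relevant KPI or SLA.

```python
from statistics import median

def is_restored(baseline_ms, post_fix_ms, tolerance=0.10):
    """True when the post-fix median is within `tolerance` of the baseline median."""
    base = median(baseline_ms)
    after = median(post_fix_ms)
    return after <= base * (1 + tolerance)

# Hypothetical response times in milliseconds.
baseline_samples = [100, 110, 90]
print(is_restored(baseline_samples, [105, 100, 108]))  # after a successful fix
print(is_restored(baseline_samples, [150, 160, 140]))  # still degraded
```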