This is a story about database-driven application performance. It’s about how Accedian helped a major healthcare organization accelerate the diagnostic of slowdowns reported over many months by end users and helped it to initiate code optimization by its software vendor in no more than a few hours.
Cutting MTTR (Mean Time To Resolve) for performance degradations from weeks to hours
Classical software package generating user complaints for slow response times
This group of hospitals in question uses a dedicated application to handle the admission and discharge of patients; this application is based on a thick client hosted on a Terminal server, that connects to a Microsoft SQL database server to retrieve and update the patient data. This application is provided by a third-party software vendor.
Facing the limits of traditional troubleshooting methods
The infrastructure team received repeated reports from the helpdesk team indicating the occurrence of user complaints regarding response time slowdowns with the application. The team conducted multiple troubleshooting attempts to locate the existence of the degradation and its root cause:
- SNMP monitoring revealed no loss of availability or resource outage
- Database system monitoring showed normal values
- Wireshark analysis failed due to the intermittency of degradations (i.e., never at the right time at the right place) and the few resources available compared to the volume of data that was to be analyzed.
They tested different configurations on the terminal servers as well as on the database server. They tried to compare the behaviour accessing the database server through the TSE infrastructure or directly from the client to the database server; they also opened multiple tickets at the vendor’s helpdesk and got little response due to the lack of concrete facts provided. After a few months, using their existing diagnostic and monitoring tools (mostly composed of SNMP tools, Wireshark, and system based trackers), the healthcare organization conducted a trial of SkyLIGHT PVX.
7 EASY STEPS TO IDENTIFY THE PAIN POINT FOR THIS DATABASE DRIVEN APPLICATION PERFORMANCE
1.Understand how the application is structured
(see the chart hereabove)
2.Define and implement a traffic capture method
Use a SPAN port of the traffic sent from/to the TSE and the MS-SQL servers
3. Install the SkyLIGHT PVX virtual appliance
Install on a server and connect it to the SPAN port (see this tutorial to install the virtual appliance in 5 minutes and this link to download the evaluation appliance). Then let SkyLIGHT PVX capture traffic and analyze application transactions (a great time for a coffee or a quick lunch)
4. Check the network layer
Here SkyLIGHT PVX indicates no variation in network latency and no sign of significant packet loss of TCP retransmission.
5. Check the performance of the TSE farm
End-user experience for the Terminal server layer is stable and is not impacted by any network or server processing slowdown.
6. Check the overall Database Transaction Performance for all MS-SQL users
The overall database transaction performance shows fast sever processing and a very limited number of SQL errors.
The trend chart displays no variation based on time: database performance looks stable overall.
7. Narrow down to the performance of individual SQL Transactions looking for:
A – QUERIES WITH SLOW SERVER PROCESSING
Show the transactions with the slowest server response times.
B- QUERIES WITH ABNORMAL FREQUENCY
Some queries may happen too often for no reason (or for poor design); they may have a negative impact on the load and performance of your SQL servers.
C- QUERIES WITH SLOW DATA TRANSFER
This screen shows the queries whose data transfers are the slowest: slow transfers are the combination of responses with large volumes of data and degraded transfer conditions (retranmissions, TCP errors – coming either from the network or the systems – client or server).
D- QUERIES SENDING ABNORMAL VOLUME OF DATA
In our case, neither the processing by the database server nor the transfer of data over the network was the cause of the slowdown. The actual root cause was poorly optimized SQL queries sending excessive data sets, which thick clients had difficulties processing within a reasonable delay.
Conclusion: Ensuring business users can make the most of their applications
Applications are built to help business operations get to their optimal productivity. When slowdowns occur on a regular basis, the healthcare organization’s overall productivity is hit, as well as the quality of service provided to patients.
By implementing capabilities to shorten the diagnostic time for response time degradations, this hospital is generating hundreds of thousands of euros of savings each month.