How to mitigate performance risk in a data center migration?

We see more IT departments considering the adoption of network performance monitoring (NPM) and application performance management (APM) when dealing with data center moves. A migration may be driven by a merger, a group-wide data center consolidation, or simply a move to a hosting center or a private cloud. Some IT departments look at this technology a little late (to help them fix degradations and misconfigurations that appeared during the migration), but many follow best practices and use NPM and APM solutions to assess current performance, then prepare, plan, execute, and review their migration.

A data center migration takes time, so there is an intermediate stage between its start and its end that can last several weeks or months. During that period, some systems are in the initial data center and the rest are in the target data center. Managing the performance delivered to end users during this period is a challenge.

This article briefly describes the main stages at which performance management helps you mitigate the risks attached to major infrastructure changes such as a data center migration.

Before the data center migration

1. Measure initial end-user performance

2. Collect performance data to figure out how to migrate systems

To establish the best plan (i.e., what to move and in which order) and to avoid future errors, downtime, and performance degradations during the migration, you need to collect the following data and make sure that it is accurate:

  • A list of all critical application chains
  • Dependency mapping: the dependencies between the different hosts (infrastructure services such as DNS, storage, and authentication servers, and all the hosts of the different tiers of each application), including any IP addresses that are hard-coded
  • The network requirements between all these hosts (i.e., volume and bandwidth) and the performance profile of each type of communication (e.g., the respective impact of network, server processing, and data transfers on end-user response times)
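
As a rough illustration of dependency mapping, the sketch below builds a host-to-host map from flow records and groups hosts that talk to each other, directly or indirectly, as candidates to move in the same wave. The record format, host names, and function names are assumptions made for this example; they do not correspond to the export format of any particular NPM or APM product.

```python
from collections import defaultdict

# Hypothetical flow records: (client_host, server_host, bytes_transferred)
flows = [
    ("web01", "app01", 5_000_000),
    ("app01", "db01", 20_000_000),
    ("app01", "dns01", 10_000),
    ("web02", "app02", 1_000_000),
    ("app02", "db02", 3_000_000),
]

def build_dependency_map(flows):
    """Return {host: {peer: total_bytes}}, recorded in both directions."""
    deps = defaultdict(lambda: defaultdict(int))
    for src, dst, nbytes in flows:
        deps[src][dst] += nbytes
        deps[dst][src] += nbytes
    return deps

def migration_groups(deps):
    """Group hosts that communicate directly or indirectly (connected components)."""
    seen, groups = set(), []
    for host in sorted(deps):
        if host in seen:
            continue
        stack, group = [host], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(peer for peer in deps[current] if peer not in group)
        seen |= group
        groups.append(sorted(group))
    return groups

deps = build_dependency_map(flows)
groups = migration_groups(deps)
for group in groups:
    print(group)
```

In a real environment, shared infrastructure services such as DNS or authentication would link almost every host into a single group, so they would probably need to be excluded from the grouping and handled separately.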

3. Anticipate additional latency during the migration and its impact on the end-user experience

When considering a significant move such as a data center migration, you will almost certainly have to phase the operation in several steps or waves. This leads to a situation where some systems are still in your original data center and others are already migrated. Both host groups will keep communicating and exchanging data, but over a longer distance and with higher network latency. This affects end-user response times, and you need to anticipate by how much. For example, if this stage adds 10 ms of network latency, the result may be acceptable response times, unacceptable ones, or even an outage for time-sensitive applications, depending on the average server processing time, the quantity of data exchanged, and the number of round trips each transaction requires.
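
To get a first-order feel for this impact, the sketch below uses a deliberately simplified model: response time is the sum of server processing time, one network round trip per application turn, and raw transfer time. The model, the function name, and all the figures are assumptions chosen for illustration; real transactions also depend on TCP behavior, parallelism, and retransmissions.

```python
def estimated_response_time_ms(server_ms, round_trips, rtt_ms,
                               payload_bytes, bandwidth_bps):
    """Simplified model: server time + round trips * RTT + transfer time."""
    transfer_ms = payload_bytes * 8 * 1000 / bandwidth_bps
    return server_ms + round_trips * rtt_ms + transfer_ms

# Hypothetical "chatty" transaction between an application tier and its database.
profile = dict(server_ms=150, round_trips=200,
               payload_bytes=500_000, bandwidth_bps=1_000_000_000)

before = estimated_response_time_ms(rtt_ms=0.5, **profile)   # both tiers in one site
after = estimated_response_time_ms(rtt_ms=10.5, **profile)   # tiers split, +10 ms RTT

print(f"before: {before:.0f} ms, after: {after:.0f} ms, added: {after - before:.0f} ms")
```

Under these assumptions, 200 round trips multiplied by 10 ms of extra latency add about 2 seconds to each transaction, whereas the same 10 ms would be barely noticeable for an application that needs only a handful of round trips.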

Execute the migration

4. Keep performance under control during the move / migration

During the data center migration, performance slowdowns, incidents, and other complaints will be reported to your helpdesk. They may be more numerous than during a normal period, and they may concern existing problems as well as new issues. The migration operation will certainly be the #1 suspect! You need to be well prepared and continuously monitor end-to-end response times and errors in both the legacy and the target data centers to avoid:

  • any performance degradations resulting in business losses (productivity and / or revenue)
  • any delay in the execution of the migration
  • any lack of shared performance data leading to a lack of collaboration between IT departments (e.g., network, system, database, applications) when troubleshooting slowdown issues
  • any negative feedback on your team’s work
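
One simple way to keep an eye on performance during the move is to compare current response times with the baseline measured before the migration. The sketch below flags applications whose median response time exceeds the baseline median by more than a tolerance factor. The application names, sample values, and the 25% tolerance are assumptions for illustration only.

```python
from statistics import median

def degraded_apps(baseline, current, tolerance=1.25):
    """Return {app: (baseline_median, current_median)} for apps slower than
    baseline_median * tolerance."""
    flagged = {}
    for app, samples in current.items():
        if app not in baseline or not samples:
            continue  # no reference data or no measurements for this app
        base = median(baseline[app])
        now = median(samples)
        if now > base * tolerance:
            flagged[app] = (base, now)
    return flagged

# Hypothetical end-user response times in milliseconds.
baseline = {"crm": [200, 220, 210], "erp": [400, 420, 410]}
current = {"crm": [215, 225, 220], "erp": [600, 650, 620]}

flagged = degraded_apps(baseline, current)
for app, (base, now) in flagged.items():
    print(f"{app}: {base} ms -> {now} ms")
```

Because the comparison relies on the same data for every team, it can also serve as the shared reference when network, system, database, and application teams troubleshoot a slowdown together.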

After the migration

5. Analyze and identify remaining errors / misconfigurations

Migrating large IT systems comes with a massive amount of complexity; whatever your level of preparation and resources, some items will be missed and may remain invisible. You have to review all traffic patterns that point to misconfigurations or inherited behaviors (e.g., hosts calling services, hosts, or network segments that no longer exist). One-way communications and data flows to old addressing plans should also be tracked.
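
As an illustration of this review, the sketch below scans flow records for destinations that still belong to the old addressing plan and for flows that never received a response. The subnets, the record format, and the function names are hypothetical; only the standard-library `ipaddress` module is used.

```python
import ipaddress

# Hypothetical addressing plan of the decommissioned data center.
LEGACY_NETWORKS = [ipaddress.ip_network("10.1.0.0/16")]

# Hypothetical post-migration flow records: (source_ip, destination_ip, response_bytes)
flows = [
    ("10.2.0.10", "10.2.0.20", 48_000),  # healthy flow inside the target site
    ("10.2.0.11", "10.1.0.53", 0),       # still calling an old address
    ("10.2.0.12", "10.2.0.99", 0),       # request sent, nothing came back
]

def flows_to_legacy(flows, legacy_networks=LEGACY_NETWORKS):
    """Return (source, destination) pairs whose destination is in the old plan."""
    stale = []
    for src, dst, _response_bytes in flows:
        addr = ipaddress.ip_address(dst)
        if any(addr in net for net in legacy_networks):
            stale.append((src, dst))
    return stale

def one_way_flows(flows):
    """Return (source, destination) pairs that never received a response."""
    return [(src, dst) for src, dst, response_bytes in flows if response_bytes == 0]

stale = flows_to_legacy(flows)
one_way = one_way_flows(flows)
print("to legacy plan:", stale)
print("one-way:", one_way)
```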

6. Measure and report on the end-to-end user performance after the migration

Once the migration is considered complete, you should measure end-user performance and report the variation (ideally an improvement) that comes with the new infrastructure.
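
A before/after report can be as simple as comparing the median and a high percentile of end-user response times. The sketch below uses a nearest-rank percentile and made-up samples; the function names and the figures are assumptions, not measurements.

```python
import math
from statistics import median

def percentile(samples, p):
    """Nearest-rank percentile (p between 0 and 100) of a non-empty sample list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def migration_report(before, after):
    """Summarize the change in response times between two sets of samples."""
    med_before, med_after = median(before), median(after)
    return {
        "median_before": med_before,
        "median_after": med_after,
        "median_change_pct": (med_after - med_before) * 100 / med_before,
        "p95_before": percentile(before, 95),
        "p95_after": percentile(after, 95),
    }

# Hypothetical response times in milliseconds for one application.
before = [100, 200, 300, 400, 500]
after = [80, 160, 240, 320, 400]

report = migration_report(before, after)
for key, value in report.items():
    print(f"{key}: {value}")
```

A negative `median_change_pct` indicates that response times improved after the migration.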

Don’t forget to request a promotion for the good job you’ve done!