The emergence of the Public Cloud is breathing new life into an old problem: delivering a meaningful disaster recovery solution at an affordable price. Historically, organizations have had the choice of pursuing disaster recovery contracts with service providers or maintaining a secondary data center or colocation facility for disaster recovery purposes. Utilizing the Public Cloud for disaster recovery can eliminate the need to maintain equipment in a separate data center or colocation facility while providing greater flexibility and predictability than a disaster recovery contract with a service provider.
Hidden costs and challenges often accompany dedicated disaster recovery sites
Maintaining a separate data center or colocation facility to support disaster recovery can result in unexpected costs and operational challenges.
For disaster recovery plans to stay current, compute and storage capacity supporting recovered workloads must be kept in alignment with production. So, as production compute and storage platforms are expanded for growth and refresh, additional investments are often required at the disaster recovery location to ensure sufficient capacity is available to recover critical applications.
In addition to capacity alignment, firmware and software supporting the underlying compute and storage infrastructure platforms in disaster recovery locations must be kept current and patched, adding to the workload of operations teams.
Many companies look to get the most out of their dedicated disaster recovery sites by utilizing idle disaster recovery compute and storage infrastructure to support development and testing environments. While this practice helps organizations control costs, it adds complexity to the disaster recovery process, which can make it difficult to maintain accurate and up-to-date recovery plans.
Recovery time objectives (RTOs) can be challenging to maintain with service provider disaster recovery contracts
Organizations that contract with a traditional disaster recovery service provider may own and maintain connectivity, replication storage, and base infrastructure equipment at the provider’s data center. Most contracts, however, are structured so that the service provider makes a set of infrastructure equipment available within an agreed-upon amount of time following the declaration of a disaster. The equipment contracted from the service provider typically consists of the compute capacity, additional storage resources, and a combination of network and security appliances required to recover protected production workloads.
The equipment provided at the time of a disaster or for planned recovery testing events must be connected to the enterprise customer’s network and configured with firmware levels that mirror the enterprise customer’s production environment before any workloads can be restored. The logistical complexity and coordination of this activity limit the recovery time objectives (RTOs) that can be achieved by the overall solution.
For the disaster recovery plan to remain functional and effective, organizations must make sure they have a process in place to update their disaster recovery equipment needs and configuration procedures as applications are added, altered and eliminated and as infrastructure services such as virtualization infrastructure, load balancing appliances and perimeter security appliances are altered in the production environment.
Cloud-based disaster recovery solutions can help organizations reduce disaster recovery costs, improve recovery time objectives and enable new testing capabilities
As Public Cloud platforms such as Amazon Web Services (AWS) and Microsoft Azure have matured their offerings and increased market presence, traditional storage replication and backup software vendors have repositioned their products to leverage low-cost cloud storage services. There are several workload replication solutions on the market that enable organizations to synchronize their on-premises production workload images and data to storage on AWS and Azure while also providing the intelligence to translate on-premises VMware-based virtual workloads to either AWS EC2 instances or Azure virtual machines. There are three notable benefits achieved with this cloud-based disaster recovery approach:
1. Steady state compute costs are minimized
Disaster recovery plans that leverage public cloud platforms such as AWS and Azure utilize on-demand compute resources to run protected workloads. Outside of periodic testing or actual disaster recovery events, the result is minimal ongoing compute costs. The steady-state infrastructure costs of a cloud-based disaster recovery approach typically include the cloud storage usage charges to support replicated data, dedicated network bandwidth to support the replication processes, replication software costs and a small amount of compute and storage for steady-state infrastructure services such as Active Directory and the replication management endpoint. In contrast, with traditional disaster recovery solutions, organizations incur significant capital expenses to purchase compute hardware for disaster recovery locations or increased contract costs from their service provider to set aside large amounts of compute to support the recovery plan.
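The cost components listed above can be sketched as a simple model. This is a minimal illustration only: the class name, field names, and the sample figures in the usage lines are hypothetical placeholders rather than actual AWS or Azure prices, and a real analysis would use an organization's own quotes and usage data.

```python
from dataclasses import dataclass


@dataclass
class SteadyStateCosts:
    """Monthly steady-state cost components of a cloud-based DR design.

    All fields are assumptions for illustration; amounts are in one currency.
    """
    storage_gb: float            # replicated data held in cloud storage
    storage_rate_per_gb: float   # assumed storage charge per GB per month
    network_bandwidth: float     # dedicated replication bandwidth, per month
    replication_software: float  # replication software cost, per month
    baseline_services: float     # always-on services such as Active Directory
                                 # and the replication management endpoint

    def monthly_total(self) -> float:
        """Sum of all steady-state charges for one month."""
        return (
            self.storage_gb * self.storage_rate_per_gb
            + self.network_bandwidth
            + self.replication_software
            + self.baseline_services
        )


def annual_cost(steady: SteadyStateCosts,
                on_demand_hourly: float,
                recovery_hours_per_year: float) -> float:
    """Twelve months of steady-state cost plus on-demand compute that is
    paid for only while tests or real recovery events are running."""
    return 12 * steady.monthly_total() + on_demand_hourly * recovery_hours_per_year


# Placeholder inputs, not real prices.
example = SteadyStateCosts(
    storage_gb=1000,
    storage_rate_per_gb=0.25,
    network_bandwidth=300,
    replication_software=200,
    baseline_services=100,
)
print(annual_cost(example, on_demand_hourly=50, recovery_hours_per_year=48))
```

Separating the always-on charges from the on-demand compute term makes it easy to see how little of the annual figure depends on compute when recovery hours are low, which is the point of the comparison with dedicated hardware.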
2. Recovery processes become increasingly automated
Since cloud disaster recovery plans are implemented in software, large portions of the recovery execution plan can be automated. Although workloads can be replicated and ready to restore at a traditional disaster recovery service provider location, the time that it takes to get the hardware positioned and firmware ready can have a dramatic impact on RTOs. Even if an organization owns its disaster recovery location and has hardware ready to go, most of the network and security services need to be explicitly configured. Scripting can be used to facilitate the configuration of those devices, but because disparate hardware platforms are involved, invoking those scripts still requires manual effort. Public cloud disaster recovery scenarios provide the ability to automate the configuration of those network and security services more consistently.
3. Recovery testing becomes much more flexible
Testing flexibility is a valuable benefit of cloud-based disaster recovery solutions. Since it’s often not practical to alter the IP addresses of recovered workloads, traditional disaster recovery tests must either be done in isolation or the production workloads must be taken offline so that the recovered instance of the application doesn’t interfere with the production instance. Isolating the recovery environment from production users can result in the need for a planned change control activity to reconfigure network devices in the enterprise network. When organizations use service provider contracts for disaster recovery, there are often only a limited number of tests built into each contract. Any additional disaster recovery testing must be paid for separately. Cloud-based disaster recovery solutions make it easy for isolated network environments to be spun up and down quickly to support various testing scenarios. Not only does that enable more flexible and frequent disaster recovery testing, but it also opens the door to support other testing needs, like security patch testing or application release testing.
Customized cost analysis, robust and secure connectivity and cloud operational methods are key elements of an effective cloud-based disaster recovery solution
Although we have identified some attractive benefits associated with cloud-based disaster recovery approaches, there are some implementation considerations that must be addressed to achieve the full value of these solutions.
Software that a company chooses to support replication to the cloud should be closely analyzed. Many companies have existing software investments which support workload replication between data centers. It may turn out that those existing products require significant license upgrades to take advantage of cloud replication. Those costs should be closely analyzed and compared against the impact of pivoting to any cheaper alternatives that may exist.
To successfully implement a cloud-based disaster recovery solution, an organization should have a well-developed, secure and robust cloud connectivity design. Virtual private networks with the proper connectivity and security configuration will be required to support the recovered workloads. See our previous blog on this topic for a more in-depth discussion of these implications.
Finally, organizations will need to develop operational monitoring practices and methods to support recovered workloads in the cloud. This might entail extending and applying existing tools and procedures to the cloud workloads, but most likely it will involve developing at least some rudimentary new cloud platform operational knowledge and methods.
It is often argued that the ideal use case for Public Cloud is the development of cloud-native applications that can take advantage of cloud platform services such as serverless compute, auto-scaling resources and large data analytics processing capabilities. A strong case can be made that, by transitioning existing workloads to cloud-based disaster recovery, organizations can immediately continue their cloud adoption journey while delivering business value.