From the Frontlines: Real-world DR Plan Test
It’s a common story. An organization knows its IT systems are critical for keeping the business operational but has a lean staff and many competing priorities. While it may have a data backup plan in place, a comprehensive disaster recovery (DR) solution is seldom top of mind. The problem: the organization, like many companies, is betting that it won’t suffer a disaster of any kind.
But, as we know, disasters can and do happen. Even with a DR solution in place, things can go terribly wrong if the solution wasn’t previously tested to ensure its effectiveness. Successfully recovering from a disaster also can hinge on whether the organization follows through with the DR playbook that accompanies the DR plan.
At US Signal, we’ve seen this scenario play out numerous times. One case, in particular, demonstrates how important DR testing and playbooks are, along with a comprehensive, customized solution.
Beyond Backup
A customer with several remote sites had contracted US Signal to create an offsite backup solution. After researching the customer’s business requirements, US Signal determined that more than backup would be needed to meet the organization’s data protection needs. The better solution was one that centered around US Signal’s Cloud Availability for Veeam. The Veeam-based service provides a comprehensive approach to data protection, with solutions for local backup, offsite backup and replication, as well as DR site services.
Specifically, the solution would draw from the replication and backup solutions provided by Cloud Availability for Veeam. Additional services would be added to fill in any gaps.
The Solution Overview
US Signal’s Cloud Replication for Veeam offers a flexible cloud-based replication target for on-premises virtual machines (VMs). There’s no need to maintain a second site. The protected workloads replicate directly into vCloud Director, creating ready-to-start copies of customer VMs in a secure US Signal resource pool.
To restore files, the customer simply fails over to the replication environment and fails back to the source as needed. Recovery point objectives (RPOs) as low as 15 minutes can be achieved, along with a recovery time objective (RTO) of minutes to hours.
Likewise, Cloud Backup for Veeam eliminates the cost and complexity of building and maintaining off-site infrastructure, and provides a fast, secure way to get VM backups off-site and into the US Signal Cloud. All backups and offsite copies are managed and recovered directly from the customer’s Veeam backup console.
Cloud Backup also includes an important feature: Insider Protection. It adds an extra level of security for backups and is particularly helpful in protecting backups from ransomware. If a backup file is deleted accidentally or maliciously, it’s retained in an air-gapped directory, which functions as a recycle bin. This isolated folder isn’t visible to the customer and isn’t exposed to public routing.
The deleted backup files, both full backups (VBK files) and incremental backups (VIB files), remain in the recycle bin for a specified period but don’t consume the customer’s storage quota.
Once the customer is ready to restore its data, US Signal can transfer the customer’s files back over the network or on a portable drive. The files can then be imported back into the customer’s Veeam Backup & Replication console. The overall solution was architected to meet the specific RTOs and RPOs for the customer’s applications. Cloud replication was used for workloads that needed to be back up within four to six hours.
US Signal’s Zerto-based managed DR solution was implemented for applications that needed to be back up in one hour or less, since the customer already had the necessary networking configured at its various sites.
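As a rough illustration of how workloads might be sorted into services by recovery time objective, the sketch below uses the two thresholds mentioned above: one hour or less for the Zerto-based managed DR, and up to six hours for cloud replication. The function name, the tier labels, the sample workloads, and the fallback to backup alone for longer RTOs are all assumptions made for illustration, not US Signal’s actual design process.

```python
from datetime import timedelta

# Thresholds taken from the case study; the tier names are illustrative labels.
ZERTO_MAX_RTO = timedelta(hours=1)
REPLICATION_MAX_RTO = timedelta(hours=6)


def choose_dr_tier(rto: timedelta) -> str:
    """Pick a protection tier for a workload from its RTO (hypothetical helper)."""
    if rto <= timedelta(0):
        raise ValueError("RTO must be positive")
    if rto <= ZERTO_MAX_RTO:
        return "zerto-managed-dr"
    if rto <= REPLICATION_MAX_RTO:
        return "veeam-cloud-replication"
    # Assumption: workloads with longer RTOs rely on offsite backup alone.
    return "veeam-cloud-backup"


# Invented example workloads and RTOs.
workloads = {
    "erp-db": timedelta(minutes=30),
    "file-server": timedelta(hours=5),
    "archive": timedelta(hours=48),
}
tiers = {name: choose_dr_tier(rto) for name, rto in workloads.items()}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

In practice the mapping would also weigh RPO, cost, and existing networking, as the Zerto decision in this case did.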
Ready for Action
The DR solution was deployed in early 2021, and US Signal conducted a proof of completion test to confirm the network was configured correctly and to ensure the applications would fail over as expected.
Importantly, US Signal stayed engaged with the customer, jointly reviewing its DR playbook monthly and scheduling regular tests to ensure that any network changes that happened over the course of the year were accounted for. The idea was to make sure that if a failover was needed, the team would not waste precious time troubleshooting network issues in the middle of a data center crisis.
DR Plan Execution
A few months later, the customer contacted US Signal. A breach was suspected and the customer wanted US Signal on standby. Within hours, it was confirmed that there had been a successful ransomware attack. US Signal jumped into action, executing the playbook and staying in communication with the customer’s engineering team.
Servers were failed over and the network was cut over so no cross traffic would affect the newly spun-up servers. Everything was vetted. Passwords were changed. There was an incident that caused a delay in reestablishing the domain administrator server within the DR environment, but that was quickly mitigated by US Signal’s team.
One of the things US Signal had done previously was move all its customers to a three-day journal history unless they requested to stay on an eight-hour or one-day schedule. That proved beneficial for this particular customer because of the time it took to identify the breach and then fail over to the DR sites. US Signal was also able to extend the journal history out another two weeks, giving the customer additional time to go back and pull the data it needed.
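To see why the longer journal mattered, consider a simplified check of whether a clean recovery point still exists by the time a failover happens. The function below is a hypothetical sketch and the timeline is invented for illustration. It assumes the journal keeps every checkpoint younger than the configured history, measured back from the moment of failover.

```python
from datetime import datetime, timedelta


def clean_point_available(compromised_at: datetime,
                          failover_at: datetime,
                          journal_history: timedelta) -> bool:
    """Return True if the journal still holds a checkpoint from before the compromise."""
    oldest_checkpoint = failover_at - journal_history
    return oldest_checkpoint < compromised_at


# Invented timeline: failover happens 36 hours after the initial compromise.
compromised = datetime(2021, 6, 1, 2, 0)
failover = datetime(2021, 6, 2, 14, 0)

for label, history in [("8-hour", timedelta(hours=8)),
                       ("1-day", timedelta(days=1)),
                       ("3-day", timedelta(days=3))]:
    print(label, clean_point_available(compromised, failover, history))
```

With a 36-hour gap between compromise and failover, only the three-day journal in this example still reaches back to a pre-compromise state.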
Another plus for the customer was that the US Signal team took meticulous notes at every step. The team could go back into its notes and tell the customer almost minute by minute what had been done.
Unfortunately, because the customer was self-managing its replication to the cloud, the cyber attacker was able to take control of Veeam and start deleting disks on both sides. The good news, however, was that because of the Insider Protection feature in Cloud Backup for Veeam, we had backups from at least 30 days earlier. We were able to say, “We know you lost all the disks that could give you a fully functioning VM, but we have all your data.”
Throughout the recovery process, US Signal took every precaution to ensure the integrity of the recovered systems. The team also worked closely with the SOC desk to identify the method of attack, and was able to provide that information to the customer, its insurance company and its legal team.
DR Success
While all the details aren’t provided in this customer example, there are two critical takeaways from this case. First, the DR plan worked, and the customer was able to get back up and running. Second, but equally important, all the customer’s data was safe.
Working with a DR plan that had been tested contributed to the customer’s success, as did the multi-faceted, customized nature of its DR plan. US Signal’s continued interaction with the customer also made a difference. For US Signal, it’s not just about selling a service. It’s about forming a partnership with our customers, and working together to achieve their goals, protect their IT assets, and keep them operational.
If you’re interested in learning more about our DR solutions or how we can help your organization, let us know. Call 866.2.SIGNAL or email us at: [email protected].