In the past year there have been several high-profile outages affecting a wide range of organizations. In the first weeks of 2017 alone, we have seen both Delta Air Lines and United Airlines cancel flights due to major IT issues, and the internet streaming services of both Comcast and Fox Sports experience outages during Super Bowl 51, leaving some fans unable to watch the nail-biting ending.
These follow similar incidents last year. In June, storms in Sydney, Australia, knocked out Amazon Web Services in the region for around 10 hours, disrupting everything from banking to pizza deliveries. And of course, cyberattacks also took their toll: we saw the world’s biggest-ever DDoS attack, which targeted the company that controls much of the internet’s domain name system.
Despite these high-profile incidents, many businesses are still stuck in the mind-set of ‘it won’t happen to me’ and are ill-prepared for IT failures. With IT teams facing a broad range of unpredictable challenges while maintaining ‘business as usual’ operations, this mind-set places organizations at serious risk of a damaging, costly outage. It is therefore more important than ever to have plans for responding and recovering as quickly as possible when a serious incident strikes. As a saying often attributed to Franz Kafka has it, it’s better to have and not need than to need and not have. In short, effective disaster recovery is a critical component of a business’s overall cybersecurity posture.
Most large organizations do have a contingency plan in place in case their primary site is hit by a catastrophic outage – which, remember, could just as easily be a physical or environmental problem, like a fire or flood, as a cyberattack. This involves having a disaster recovery (DR) site in another city or even another country, which replicates all the infrastructure that is used at the primary site. However, a key piece of this infrastructure is often overlooked – network security – which must also be replicated at the DR site so that applications both function and remain secure when the DR site is activated.
Building security into DR
Replicating the security infrastructure, however, can be more of a challenge than it may initially appear. The network at the primary site will contain routers, firewalls, servers and so on, and the DR site may be set up in exactly the same way. The problem is that simply installing the same equipment in the same configuration isn’t enough. All of those devices hold security policies, and these policies change on a daily or even an hourly basis, every time applications and users are added, amended or removed.
As such, whenever a policy change is made at the primary site, it is critical to ensure that an equivalent change is made at the DR site. This means synchronizing the two sites’ security policies so that every change is replicated automatically. How that synchronization is implemented will depend on the exact equipment and setup the organization is using, and it’s not always easy to do.
The most straightforward scenario is when the same equipment from the same vendor is deployed at each site – and that vendor offers a unified firewall management system. This means the same policies can be installed simultaneously on the security devices at both sites; IT teams only have to make the change once in the firewall management system, and it will push the change out to the security devices at each site.
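To make the ‘change once, push everywhere’ idea concrete, here is a minimal Python sketch. It does not model any real vendor’s management system: the Rule, Firewall and PolicyManager classes, the site names and the addresses are all hypothetical, and a real deployment would drive the devices through the vendor’s own management tooling.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """One firewall rule in a simplified, hypothetical form."""
    name: str
    source: str
    destination: str
    port: int
    action: str  # "allow" or "deny"


@dataclass
class Firewall:
    """Stand-in for a security device at one site (hypothetical)."""
    site: str
    rules: dict = field(default_factory=dict)

    def install(self, rule: Rule) -> None:
        # Adding a rule with an existing name replaces the old version.
        self.rules[rule.name] = rule

    def remove(self, name: str) -> None:
        self.rules.pop(name, None)


class PolicyManager:
    """Hypothetical central manager: one change, applied to every site."""

    def __init__(self, firewalls):
        self.firewalls = list(firewalls)

    def add_rule(self, rule: Rule) -> None:
        for firewall in self.firewalls:
            firewall.install(rule)

    def remove_rule(self, name: str) -> None:
        for firewall in self.firewalls:
            firewall.remove(name)


primary = Firewall("primary")
dr = Firewall("dr")
manager = PolicyManager([primary, dr])

# The IT team makes the change once; both sites receive it.
manager.add_rule(Rule("web-in", "0.0.0.0/0", "10.1.0.10/32", 443, "allow"))
manager.add_rule(Rule("old-ftp", "0.0.0.0/0", "10.1.0.20/32", 21, "allow"))
manager.remove_rule("old-ftp")
```

Because every change goes through the manager rather than being applied to each device by hand, the two rule sets in this sketch cannot drift apart.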
Overcoming language barriers
More complex scenarios occur when organizations don’t have such a firewall management system, or when they use equipment from different vendors, or different models from the same vendor, at each site. In this setup, the policies at the two sites will not be truly identical, so they will need to be synchronized in some other way. And if you rely only on human processes to synchronize the two sites, the policies will eventually diverge. An automated system is therefore the right approach to maintaining the synchronization.
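One way an automated check for divergence might work is to reduce each site’s rules to a common, vendor-neutral form and compare the results. The sketch below assumes rules have already been exported from each device as simple dictionaries; the field names (src, dst, port, action) and the sample rules are illustrative assumptions, not any vendor’s export format.

```python
def normalize(rule: dict) -> tuple:
    """Reduce a rule dict to a comparable, vendor-neutral tuple."""
    return (
        rule["src"].strip().lower(),
        rule["dst"].strip().lower(),
        int(rule["port"]),
        rule["action"].strip().lower(),
    )


def policy_drift(primary_rules, dr_rules) -> dict:
    """Report rules missing from the DR site and rules that exist only there."""
    primary_set = {normalize(rule) for rule in primary_rules}
    dr_set = {normalize(rule) for rule in dr_rules}
    return {
        "missing_at_dr": sorted(primary_set - dr_set),
        "extra_at_dr": sorted(dr_set - primary_set),
    }


# Hypothetical exports: the same policy written in two slightly different styles.
primary_rules = [
    {"src": "any", "dst": "app-server", "port": 443, "action": "ALLOW"},
    {"src": "office-lan", "dst": "db-server", "port": "5432", "action": "allow"},
]
dr_rules = [
    {"src": "any", "dst": "app-server", "port": "443", "action": "allow"},
]

drift = policy_drift(primary_rules, dr_rules)
for rule in drift["missing_at_dr"]:
    print("Missing at DR site:", rule)
```

Run on a schedule, a comparison like this would flag the database rule that was added at the primary site but never made it to the DR site.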
The last thing to consider is the IP addresses in use at the primary and secondary sites. Are they identical or, as is more likely, are the IP addresses at the main site mapped to their logical counterparts at the DR site? If they are mapped, any rules that are installed at the secondary site are going to look slightly different to the ones at the primary site – and again, you will need an automation solution to carry out the rule conversion.
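As an illustration of that rule conversion, the following Python sketch rewrites the addresses in a primary-site rule to their DR-site counterparts using the standard ipaddress module. The subnet mapping, the rule fields and the addresses are assumptions made for the example, and the sketch assumes each mapped pair of subnets has the same prefix length.

```python
import ipaddress

# Hypothetical mapping of primary-site subnets to their DR counterparts.
SUBNET_MAP = {
    ipaddress.ip_network("10.1.0.0/16"): ipaddress.ip_network("10.2.0.0/16"),
}


def translate_address(addr: str) -> str:
    """Map a primary-site host or subnet to its DR equivalent.

    Addresses outside every mapped subnet are returned unchanged.
    """
    net = ipaddress.ip_network(addr, strict=False)
    for primary, dr in SUBNET_MAP.items():
        if net.version == primary.version and net.subnet_of(primary):
            # Keep the same offset within the subnet at the DR site.
            offset = int(net.network_address) - int(primary.network_address)
            new_base = ipaddress.ip_address(int(dr.network_address) + offset)
            return f"{new_base}/{net.prefixlen}"
    return addr


def translate_rule(rule: dict) -> dict:
    """Return a copy of the rule with source and destination translated."""
    return {
        **rule,
        "src": translate_address(rule["src"]),
        "dst": translate_address(rule["dst"]),
    }


primary_rule = {"src": "0.0.0.0/0", "dst": "10.1.0.10/32", "port": 443, "action": "allow"}
dr_rule = translate_rule(primary_rule)
print(dr_rule)
```

Here the destination 10.1.0.10/32 becomes 10.2.0.10/32 at the DR site, while the unmapped source 0.0.0.0/0 is left as it is.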
It is essential to consider all of these aspects of security policy management when building a DR site. If you neglect them, your systems and applications won’t work as you need them to when disaster strikes and you have to switch operations to your secondary site.
As we’ve seen with recent serious outages, prevention is no longer enough to ensure robust readiness for unplanned incidents and cyber threats. Organizations also need to ensure that their incident response is as slick and unified as possible, so that when (not if) the worst happens, they can get critical systems back up and running quickly and keep disruption to a bare minimum. Having your security policies configured and orchestrated across the entire organization, at both primary and DR sites, is a critical facet of this.