Part 4 of 4 — A CYBER CERBERUS CASE STUDY — Disaster recovery in Critical/Sensitive Operations.
In 2016, the Auditor General found that 64% of audited agencies in the Western Australian public sector did not have adequate disaster recovery and business continuity arrangements in place.
While engaged with an organisation in 2019 and 2020, following an audit similar to those outlined in earlier OAG reports, we identified that a sample of systems did not have adequate Disaster Recovery configuration and management. In fact, these systems failed to meet even the minimum recovery requirements for the type of infrastructure/applications assessed.
Cyber Cerberus was engaged to complete a review of critical applications and infrastructure to determine the current state of disaster recovery maturity, identify primary risks, and recommend remediation activities.
Working in a large, complex environment with a network of varied, time-poor stakeholders, this project presented our team with several challenges. Here are just a few:
Every stakeholder we engaged presented conflicting viewpoints on the status of disaster recovery management processes within the agency.
There was a distinct lack of consensus around which systems were more critical than others: “they are ALL important!”.
Definitions of what constituted a disaster varied.
Management was misaligned on where to start with the recovery of critical IT infrastructure, following an event.
These differing opinions produced inconsistent views of the risk posture for the same system and highlighted the need for clear policies, processes, and procedures to support effective disaster recovery management.
To help sift through the priorities, we examined the risk appetite of the business, and we looked at its service providers and their ability to recover the systems from a cyber incident. Consulting and advising around disaster recovery requires significant engagement with key stakeholders. There is a massive people element when it comes to recovering systems, especially critical applications which may affect people’s livelihoods.
Using our tailored methodology, we identified that a significant number of systems either had no Disaster Recovery Plan or lacked a relevant one. There was also limited knowledge of two key concepts: Recovery Time Objective (RTO), i.e. how long the system can be offline, and Recovery Point Objective (RPO), i.e. how much data the organisation is willing to lose through the recovery process. (More on RTO and RPO here - The 3 Common Pitfalls with Disaster Recovery Plans. Plus, How to Avoid Them.)
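To make the two measures concrete, here is a minimal sketch in Python of how a recovery test could be compared against a system's RTO and RPO. It is an illustration only, not the methodology used in this engagement: the names `RecoveryObjectives` and `assess_recovery`, the 4-hour RTO, the 1-hour RPO, and the timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Hypothetical container for the two targets agreed for one system."""
    rto: timedelta  # longest time the system may stay offline
    rpo: timedelta  # largest window of data the organisation accepts losing


def assess_recovery(objectives, last_backup, outage_start, service_restored):
    """Compare one outage (real or rehearsed) against the agreed objectives."""
    # Downtime: how long the system was unavailable.
    downtime = service_restored - outage_start
    # Data loss window: everything written after the last good backup is gone.
    data_loss_window = outage_start - last_backup
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= objectives.rto,
        "rpo_met": data_loss_window <= objectives.rpo,
    }


# Assumed example values, chosen only to show one target met and one missed.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
result = assess_recovery(
    objectives,
    last_backup=datetime(2020, 3, 1, 2, 0),
    outage_start=datetime(2020, 3, 1, 9, 30),
    service_restored=datetime(2020, 3, 1, 12, 30),
)
print(result["rto_met"], result["rpo_met"])  # prints: True False
```

In this made-up case the system is back within three hours, so the RTO is met, but the last backup was seven and a half hours old, so the RPO is missed. A nightly backup cannot satisfy a one-hour RPO, however quickly the system is restored.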
We developed a solution that ensured a disaster recovery plan was adequately built around the identified most critical systems (based on evidence, not opinion). We also educated relevant people on the critical RTO and RPO values, and how disaster recovery must be created/updated for systems throughout the project lifecycle.
What we learned:
We learned that no matter how large an organisation is or how much IT budget it is allocated, Disaster Recovery often does not receive the attention it needs.
Disaster Recovery is often misunderstood and mistaken for high availability.
High availability is the concept of building systems with resilience in mind. If part of the system went down, you could still operate as normal, much as a backup generator keeps the lights on during a power cut (resilient system design). This type of occurrence is termed an “incident”.
However, Disaster Recovery, as the name suggests, means building processes, procedures, and policies in preparation for systems being compromised, requiring they be rebuilt from scratch. This type of occurrence is termed a “disaster”.
We also learned that there are many ways to rationalise Disaster Recovery, and that with the right support and stakeholders involved, there are some great and easy ways of doing it.
What’s the takeaway?
The lack of adequate Disaster Recovery plans largely comes down to a lack of knowledge.
What is a DRP?
Where do I find ours?
How do I know if it’s adequate?
This is why we have dedicated several recent articles to explaining this topic.
We understand that it all sounds like a complicated process. You are not alone in feeling overwhelmed by these new unknowns. That is why we urge you to read our articles and identify areas of your DRP that need updating. You are welcome to contact us for help in further simplifying this essential component of your organisation’s overall business continuity governance.