What is Resilience Testing and Should We Be Doing It?

November 20, 2025

Last Updated on November 23, 2025

The cybersecurity paradigm is shifting from trying to prevent every attack to building the resilience to withstand attacks that are inevitable. New regulations like the EU’s Digital Operational Resilience Act (DORA) now require financial entities operating in the EU, along with their critical technology providers, to prove their operations can handle significant disruptions, especially cyberattacks.

A central element of this regulatory and market shift is resilience testing, such as threat-led penetration testing. Similar to scenario-based exercises like Red Teaming but with a tighter focus, resilience testing seeks to reduce real-world risks from specific cyberattack vectors or other disruptive events.

What does resilience testing demand, who needs to perform it, and what benefits does it offer companies looking to become resilient? This article gives business and technical leaders the essential points.

Key takeaways

The growing relevance of resilience testing reflects a paradigm shift from trying to prevent every cyberattack to developing the digital resilience to withstand disruption.
The goals of resilience testing are to measure, report on, and continuously improve resilience by pinpointing vulnerabilities and enhancing operational processes.
Unlike traditional cybersecurity testing such as penetration testing and vulnerability assessments, resilience testing involves highly realistic simulation of specific attacks.
Resilience testing supports rapid incident response and reporting, which is increasingly a compliance requirement for organizations across sectors worldwide.
“Live fire” exercises like Cloud War Games can support resilience and reduce downtime impacts by helping technical staff build key incident response and IT recovery skills.
Resilience testing is most important for critical infrastructure organizations and others holding sensitive data or supporting vital processes, especially in financial services, healthcare, government, and IT services.

What is resilience testing?

Resilience testing is a form of scenario testing that simulates a high-probability disruptive event, such as a ransomware attack, to evaluate an organization’s incident response and its ability to keep critical services up and running. Periodic resilience testing is intended to strengthen incident management, keep economically and socially important services stable, and protect against even novel threats.

Coupled with related compliance requirements like more rigorous incident management and accelerated incident reporting, resilience testing helps ensure that organizations can efficiently detect, block, or blunt the impacts of data breaches and other major digital disruptions. Prominent regulations including DORA and NIS2 require resilience testing not just for “essential and important” organizations like banks, energy providers, and transportation services, but also for their highest-risk suppliers and vendors.

Another hallmark of resilience testing is systematically applying lessons learned to proactively address vulnerabilities, refine response procedures, and ultimately build a culture of resilience rather than just improving security performance. To maintain compliance, organizations must document testing results that show continuous improvement.

How does resilience testing differ from conventional cybersecurity testing?

Some of the ways that resilience testing differs from conventional cybersecurity testing like incident response drills or penetration tests include:

Resilience testing is increasingly a compliance requirement, not simply a recommended best practice.
Resilience testing is often required more frequently than traditional tests.
Resilience testing targets highly realistic simulation of specific attacks.
Resilience testing can be more diverse, encompassing disaster recovery drills, business continuity exercises, and threat-led penetration tests.
Resilience testing should simulate “severe but plausible” events like data breaches, supply chain hacks, and nation-state cyberattacks.
Resilience testing yields relevant metrics like Mean Time to Detect (MTTD), Mean Time to Respond (MTTR), and Mean Time to Disclose.
Resilience regulations mandate documenting both results and risk mitigation steps.
While traditional vulnerability assessments and penetration tests may have been viewed as “check the box” compliance exercises, resilience testing is seen as a strategic approach to reducing disruptive risks.
Many regulations extend resilience testing to critical third-party partners and vendors, especially IT service providers like cloud service providers (CSPs) and managed security service providers (MSSPs).
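As a rough illustration of how the timing metrics in the list above might be derived from drill records, here is a minimal Python sketch. The record fields, the sample timestamps, and the exact interval definitions (detection measured from attack start, response and disclosure measured from detection) are assumptions made for this example, not definitions taken from DORA, NIS2, or any other regulation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class DrillRecord:
    """Timestamps captured during one simulated incident (hypothetical fields)."""
    started: datetime    # when the simulated attack began
    detected: datetime   # when the team first noticed it
    contained: datetime  # when the response was judged complete
    disclosed: datetime  # when the mock incident report was sent


def mean_hours(deltas: list[timedelta]) -> float:
    """Average a list of durations and return the result in hours."""
    return mean([d.total_seconds() for d in deltas]) / 3600


def summarize(records: list[DrillRecord]) -> dict[str, float]:
    """Compute the three averages, each in hours (interval choices are assumptions)."""
    return {
        "mean_time_to_detect": mean_hours([r.detected - r.started for r in records]),
        "mean_time_to_respond": mean_hours([r.contained - r.detected for r in records]),
        "mean_time_to_disclose": mean_hours([r.disclosed - r.detected for r in records]),
    }


# Two made-up drills, for illustration only.
drills = [
    DrillRecord(
        started=datetime(2025, 1, 10, 9, 0),
        detected=datetime(2025, 1, 10, 11, 0),   # 2 hours to detect
        contained=datetime(2025, 1, 10, 17, 0),  # 6 hours to respond
        disclosed=datetime(2025, 1, 11, 11, 0),  # 24 hours to disclose
    ),
    DrillRecord(
        started=datetime(2025, 4, 3, 14, 0),
        detected=datetime(2025, 4, 3, 18, 0),    # 4 hours to detect
        contained=datetime(2025, 4, 4, 4, 0),    # 10 hours to respond
        disclosed=datetime(2025, 4, 5, 6, 0),    # 36 hours to disclose
    ),
]

summary = summarize(drills)
print(summary)
```

Tracking the same averages from one test cycle to the next is one simple way to produce the documented evidence of improvement that resilience regulations ask for.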

Resilience-centric regulations like DORA and NIS2 go well beyond basic cybersecurity requirements to encompass a firm’s accountability to customers, the market, and society at large. For example, critical infrastructure businesses are increasingly required to build “security by design” into products and services, chart real-time, continuous compliance and performance data, respond decisively and transparently to breaches and incidents, and comprehensively train employees in resilience practices.

How does resilience testing connect to rapid incident reporting?

Along with an emphasis on resilience, the new wave of cybersecurity and data protection frameworks imposes stringent incident reporting requirements. DORA, NIS2, the UK’s Cyber Security and Resilience Bill, the US Cyber Incident Reporting for Critical Infrastructure Act (CIRCIA), and the US Securities and Exchange Commission (SEC) rules all exemplify this two-pronged relationship.

Improving resilience testing alongside incident reporting gives regulators, government entities, industry bodies, and supply chain partners fresher and more comprehensive threat intelligence for scenario development and testing. A rapidly escalating source of advanced cyber threats today is the partnering of nation-state adversaries with cybercriminals to pursue strategic political objectives that are tantamount to cyber warfare.

Non-government entities are frequent targets of these sophisticated attacks, as they are seen as less well defended. The hope is that when organizations report more incidents sooner and in more detail, their peers can learn, adapt, and ensure business continuity with reduced impacts, rather than simply recalibrating preventive measures.

Can “live fire” exercises help build resilience?

Resilience testing’s overall risk management mindset has been characterized as asking the question, “If we suffered a data breach right now, what would break?”

An emerging scenario testing methodology that emphasizes real-world stress testing for IT/operations teams is “war games.” An innovative model is Cloud War Games, a hands-on, real-time Amazon Web Services (AWS) infrastructure simulation where competitors face high-pressure scenarios that mimic major downtime events.

According to Matt Lea, Founder at Schematical.com and creator of Cloud War Games, “I saw a lot of young, aspiring cloud professionals just freeze up when servers went down and clients were losing $100,000 an hour. I wanted to create a training scenario that lets them think it through and actually fix things or break things—not just take a test or follow instructions.”

This kind of highly realistic incident response exercise, even if not configured to meet specific regulatory requirements, could be a powerful way to train IT staff on how to respond to cyberattacks and other disruptions. Cloud War Games scenarios can utilize either generic AWS infrastructure or a company’s own AWS environment.

Should my company perform resilience testing?

Any business that holds, shares, or processes highly sensitive data, participates in a critical supply chain, or provides economically or socially important services is exposed to sophisticated and potentially devastating cyberattacks and therefore needs to build resilience.

Consistent with the elevated threat level, threat-led penetration testing and other forms of resilience testing are increasingly a compliance requirement for critical infrastructure entities and their third-party IT service providers (e.g., CSPs, MSSPs, and others with whom they share sensitive data).

Any organization whose data breach or service outage could cause wide-scale destructive impacts should begin focusing on resilience, whether or not legal or competitive pressure already requires it to do so. Sectors that will lead in resilience testing adoption include:

Financial services, thanks to DORA’s growing global influence.
Healthcare providers and their supply chain partners, because lives and health could depend on their service availability.
The technology sector, from AWS and IBM to small SaaS providers depending on their offerings.
Government, especially agencies that deliver essential services to the public or must protect highly sensitive data.

How can resilience testing benefit my company?

Resilience testing and other forms of scenario testing offer a wide range of benefits, including:

Data to validate resilience for regulators, customers, management, and other stakeholders.
Provably stronger incident response capabilities.
Stronger trust and peace of mind among clients, partners, management, investors, and other stakeholders.
Reduced downtime risk for critical services.
Improved resilience against unknown future attacks.
Proactive elimination of vulnerabilities in systems, procedures, and plans before attackers find them.
A better view of the attack surface and how to close gaps and reduce risk.
Better informed strategic decisions and priorities around cybersecurity, privacy, and overall IT investment planning.
Enhanced agility and responsiveness in the face of unexpected events.
Greater transparency and accountability to improve public relations following an incident.

By conducting targeted resilience testing in “severe but plausible” cyberattack scenarios and other disruptive situations, organizations can not only reduce the risk of negative outcomes but also help ensure that their essential services remain operational and available.