Proving the Program Series

What Your Maturity Score Doesn't Show

Why Adversarial Testing Reveals What Self-Assessment Can't

In 2022, CISA's red team assessed a large critical infrastructure organization. Leadership described their security posture as mature. They had documented programs, defined controls and processes built around established frameworks. When CISA's team arrived, it gained persistent access, moved laterally across multiple sites and reached systems adjacent to sensitive business operations. The organization detected none of it -- not during active compromise, not when the red team deliberately tried to trigger a security response.

That's the maturity mirage in documented form. Mature on paper. Undetected in practice.

The Measurement Problem Nobody Talks About

Most organizations measure OT security maturity by asking their own teams how mature they think they are. According to Fortinet's 2025 State of Operational Technology and Cybersecurity Report, 81% of organizations globally self-assess at Level 3 or Level 4 on a five-level scale, with 46% claiming Level 4, the highest tier, characterized by continuous improvement and integrated threat intelligence. Every one of those numbers came from organizations evaluating themselves.

Now look at what professional assessments find. According to Dragos's 8th Annual OT Cybersecurity Year in Review, covering engagements conducted in 2024: 45% of professionally assessed OT networks had inadequate visibility, despite organizations typically believing they had it. 65% of assessed sites had insecure remote access conditions including default credentials, unpatched VPNs and exposed remote desktop sessions. Many had no OT-specific incident response plan despite claiming mature security programs.

Two different pictures of the same industry. Both can't be right.

What the Frameworks Actually Measure

Here's something the leading international standard for industrial cybersecurity states explicitly: there is no relationship between maturity level and security level. According to exida, an authoritative source on IEC 62443, an organization can develop and maintain a security Level 4 system using a maturity Level 1 process. The reverse is equally true. A high maturity score means your documentation is thorough. It does not mean your controls stop someone who is actively trying to bypass them.

That's not a flaw in IEC 62443. It's a design feature organizations systematically misread. The standard's maturity levels, modeled on CMMI (Capability Maturity Model Integration), measure how well security processes are defined, practiced and consistently followed. They measure documentation quality and process discipline. They do not measure adversarial resistance.

NIST SP 800-82, the U.S. government's primary OT security guidance, works the same way. Control baselines are organized and verified against checklists. Compliance is self-assessed. Adversarial testing is not required to demonstrate compliance at any level. CISA's own Cyber Security Evaluation Tool allows organizations to self-evaluate against NIST 800-82 by answering a questionnaire -- no external challenge required.

Both frameworks are genuinely useful. Neither was designed to answer the question an adversary answers on day one of an engagement: does this actually work? Article 8 in this series examines that gap between documented compliance and tested capability directly. It covers the specific ways compliance-based assessment produces a maturity picture that looks complete on paper while leaving gaps that assessments pass and adversaries use.

The Gaps Red Teams Find Every Time

The gap between documented maturity and tested capability tends to appear in the same places. Segmentation that looks clean on a network diagram has evolved into flat architectures with undocumented cross-zone pathways. Industrial protocols including Modbus and DNP3 transmit in plaintext with no authentication and are routinely left unmonitored. Even OPC-UA, which includes built-in security capabilities, is frequently deployed without those features enabled. Detection capabilities generate alerts nobody investigates. Vendor remote access connections established for a maintenance window months ago remain open and unreviewed.
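
To make the "plaintext with no authentication" point concrete, here is a minimal sketch that builds a standard Modbus/TCP read request byte by byte. The function name and the example values (transaction ID, unit ID, register range) are assumptions chosen for illustration. The frame layout follows the public Modbus/TCP specification. Nothing is sent on the network.

```python
import struct


def build_read_holding_registers(transaction_id: int, unit_id: int,
                                 start_address: int, quantity: int) -> bytes:
    """Build a Modbus/TCP 'Read Holding Registers' (function 0x03) request."""
    # PDU: function code (1 byte), starting address (2), register count (2).
    pdu = struct.pack(">BHH", 0x03, start_address, quantity)
    # MBAP header: transaction id (2), protocol id (2, always 0 for Modbus),
    # length of the bytes that follow (2), unit id (1).
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


# Illustrative values only: read 10 registers starting at address 0 on unit 1.
frame = build_read_holding_registers(transaction_id=1, unit_id=1,
                                     start_address=0, quantity=10)
print(frame.hex(" "))
```

Every byte in the 12-byte request is accounted for by the header and the register range. There is no field for a credential, session token or signature, so any host that can reach the device's Modbus port can issue the same request unless a surrounding control (segmentation, an allowlist, monitoring) intervenes.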

Our Shadow Current series has traced how these conditions translate into exploitable attack paths: how adversaries move through unmonitored protocol traffic and across segment boundaries that organizations assume are closed. The Red Team applies those same paths in a controlled engagement, documenting not just that the path exists but whether your defenses detect it.

According to SANS Institute's 2025 State of ICS/OT Cybersecurity survey, only 3% of organizations have full penetration testing coverage of their OT environments. Only 8% have full detection coverage at field and remote sites. Only 14% feel fully prepared for emerging threats.

These are not theoretical gaps. They are the conditions that allowed CISA's 2022 red team to operate undetected against a self-described mature program. In a separate 2024 advisory, CISA described an engagement where leadership had minimized the business risk of a known attack vector their own security team had already identified. The red team exploited it and reached OT human-machine interface access. That pattern -- leadership deprioritizing a risk their own team had already flagged -- connects directly to the Industrial CISO gap the first article in this series documented. When OT security lacks an accountable executive owner, assessment findings don't get escalated with appropriate urgency. The leadership vacancy doesn't just produce untested controls; it produces the organizational conditions that keep them untested.

The Confidence Paradox

The organizations most certain about their maturity are often the ones with the largest gaps. A 2025 identity security study by BeyondID found that organizations rating themselves "Advanced" followed only 4.7 out of 12 security best practices on average, fewer than those rating themselves merely "Established," who followed 5.1. Higher confidence, fewer controls in practice.
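
Expressed as adoption rates, the comparison is easier to read. This is a small arithmetic sketch using only the two averages quoted above. The variable names are illustrative.

```python
PRACTICES_TOTAL = 12

# Average number of best practices followed, by self-assessed tier
# (figures as quoted from the BeyondID study above).
followed = {"Advanced": 4.7, "Established": 5.1}

rates = {tier: count / PRACTICES_TOTAL for tier, count in followed.items()}
for tier, rate in rates.items():
    print(f"{tier}: {rate:.1%} of practices followed")
```

Both groups follow well under half of the listed practices, and the group with the higher self-rating follows the smaller share.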

There's a broader data point that frames the scale of the problem. Gartner research, cited in analysis following the Colonial Pipeline incident, found that 60% of critical infrastructure organizations were at the earliest stages of OT security maturity. In Fortinet's 2025 survey, 81% of organizations self-reported at Level 3 or higher. The two figures come from different studies and years, but a gap that wide is not a rounding error. It reflects a structural disconnect between how organizations measure themselves and what external assessment finds.

Dragos's 2026 OT Cybersecurity Year in Review adds a concrete operational dimension to the gap. Organizations with comprehensive OT visibility detected and contained ransomware incidents in an average of five days. The industry average was 42 days. The difference between those two numbers is the difference between a contained incident and a production shutdown.

What Adversarially Validated Maturity Means

A Red Team assessment does not produce a maturity score. It produces a map. Where the team entered. Which controls they bypassed and how. Where detection failed. What the actual path to high-value OT systems looks like in your specific environment under realistic adversarial pressure.

MITRE ATT&CK for ICS provides the technique library. An OT-focused Red Team applies the specific tactics, techniques and procedures documented threat actors use against industrial environments, then records which defenses held and which didn't. The output is evidence, not assumption.
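
One way to picture that evidence is as a per-technique record of what was attempted and what was detected. The sketch below is an assumption about how such findings could be structured, not a description of any team's actual tooling. The `Finding` class, the helper functions and all engagement results are hypothetical. The technique IDs are given as ATT&CK for ICS identifiers and should be checked against the current matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    technique_id: str  # ATT&CK for ICS ID (verify against the current matrix)
    technique: str
    executed: bool     # did the red team succeed in performing the technique?
    detected: bool     # did defenders raise and act on an alert?


# Hypothetical engagement results, invented for illustration only.
findings = [
    Finding("T0822", "External Remote Services", executed=True, detected=False),
    Finding("T0859", "Valid Accounts", executed=True, detected=False),
    Finding("T0886", "Remote Services", executed=True, detected=True),
    Finding("T0855", "Unauthorized Command Message", executed=False, detected=False),
]


def undetected_path(results):
    """Techniques that succeeded with no detection: the demonstrated gaps."""
    return [f for f in results if f.executed and not f.detected]


def detection_rate(results):
    """Share of successfully executed techniques that defenders detected."""
    executed = [f for f in results if f.executed]
    if not executed:
        return 0.0
    return sum(f.detected for f in executed) / len(executed)


for f in undetected_path(findings):
    print(f"{f.technique_id} {f.technique}: executed, not detected")
print(f"Detection rate: {detection_rate(findings):.0%}")
```

Unlike a maturity score, each row is tied to something that happened during the engagement, which is what lets investment follow demonstrated gaps rather than documented ones.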

That evidence replaces the hypothesis. Organizations that know which controls actually performed can direct investment toward capabilities with demonstrated gaps rather than documented ones. The vCISO series covers how the OT vCISO establishes that honest baseline through structured assessment, with the Red Team's adversarial findings serving as the external component that makes that baseline accurate rather than assumed. The CFC series covers how the fusion center conducts the structured methodology assessment that programs use as their starting point, and what it means when Red Team validation puts those capabilities under realistic adversarial pressure.

The recalibration that follows a Red Team engagement is not a verdict on the security program. It's the information that makes every other part of the program honest.

The Question Every Score Should Answer

Has anyone actually tried to break this? If the answer is no, the score is a hypothesis. The Red Team is how you test it -- and how you keep testing it, because a recalibrated program that stops being challenged eventually drifts back toward the conditions that made testing necessary in the first place.

What Comes Next

Maturity tells you the strength of your defenses. Threat intelligence tells you what those defenses need to stop. When the intelligence picture is incomplete or disconnected from real adversary behavior, even a genuinely mature program can be aimed at the wrong targets. That's where we go next.

Learn more about the OT vCISO role in The Missing Leadership Layer in Industrial Cybersecurity -- an executive brief covering why adversarial testing and program leadership work together to produce a defense posture that holds under pressure.

A score is a hypothesis. The Red Team is how you test it.

81% Self-assess at OT maturity Level 3 or 4
65% Of assessed sites have insecure remote access conditions
3% Have full OT penetration testing coverage
42 Days Industry average ransomware containment (vs. 5 days with full OT visibility)

🎭 The Maturity Mirage

CISA's 2022 red team gained persistent access, moved laterally across multiple sites, and reached sensitive systems. The organization detected none of it.

The organization self-described as mature. Documented programs. Defined controls. Framework-aligned processes. The red team deliberately tried to trigger a security response. Nothing fired. Mature on paper. Undetected in practice. That is not an edge case -- it is the pattern CISA documented in advisory AA23-059A.

📋 The Framework Gap

IEC 62443 states explicitly: there is no relationship between maturity level and security level. Your score measures documentation. Not adversarial resistance.

Maturity levels measure process discipline and documentation quality. NIST SP 800-82 uses checklists. Compliance is self-assessed. Adversarial testing is not required at any level. CISA's own evaluation tool is a questionnaire. Both frameworks are useful. Neither answers what an attacker answers on day one: does this actually work?

🔓 Segmentation and Protocol Gaps

Clean network diagrams evolve into flat architectures. Modbus and DNP3 run in plaintext with no authentication. Vendor access from months ago stays open.

45% of professionally assessed OT networks had inadequate visibility, even though the organizations running them typically believed otherwise. OPC-UA is frequently deployed without its security features enabled. Detection systems generate alerts nobody investigates. These are the Shadow Current conditions -- the undocumented pathways adversaries find when organizations believe they've closed them.

📊 The Confidence Paradox

"Advanced" organizations follow 4.7 out of 12 security best practices on average. "Established" organizations follow 5.1. Higher self-confidence, fewer controls in practice.

Gartner found 60% of critical infrastructure organizations at the earliest stages of OT security maturity. Fortinet's 2025 survey found 81% self-reporting Level 3 or higher. That is not a rounding error. It is a structural disconnect between self-measurement and external assessment that runs across the industry.

🗺 What the Red Team Produces

A Red Team assessment doesn't produce a maturity score. It produces a map. Where the team entered. Which controls they bypassed. Where detection failed.

Using MITRE ATT&CK for ICS, the team applies specific tactics, techniques and procedures documented threat actors use against industrial environments. Which defenses held. Which didn't. The output is evidence, not assumption. Organizations direct investment toward capabilities with demonstrated gaps rather than documented ones.

What CISA Red Teams Actually Find

The 2022 persistent access engagement and the 2024 HMI advisory

CISA advisory AA23-059A documented the 2022 engagement in detail. The red team was assessing a large critical infrastructure organization that described its security posture as mature with comprehensive controls and frameworks in place. The team gained persistent access to the environment, moved laterally across multiple sites, and reached systems adjacent to sensitive business operations. They deliberately triggered activities designed to produce security alerts. The organization detected none of it.

A separate 2024 CISA advisory described a different engagement with a similar structural failure. Leadership had minimized the business risk of a known attack vector that their own security team had already identified and flagged. The red team exploited it and reached OT human-machine interface access.

That second pattern is critically important. The issue wasn't a technical gap nobody knew about. The gap had been identified internally. The failure was organizational: the finding was not escalated with appropriate urgency. When OT security lacks an accountable executive owner, assessment findings don't produce action. The leadership vacancy doesn't just produce untested controls -- it produces the conditions that keep them untested regardless of what the maturity score shows.

The Framework Design Problem

Why IEC 62443 and NIST SP 800-82 can't answer the adversarial question

IEC 62443's maturity model is derived from CMMI, the Capability Maturity Model Integration. CMMI measures process discipline: are processes defined, practiced, consistently followed, and improving over time. That is a genuinely valuable measurement. It tells you whether your security program is organizationally sound. It does not tell you whether your controls will hold under adversarial pressure.

The standard itself acknowledges this. According to exida, an authoritative IEC 62443 source, an organization can achieve maturity Level 4 while operating a security Level 1 system. The maturity level and the security level are independently measured. Organizations that treat a high maturity score as evidence of adversarial resilience are misreading what the standard was designed to assess.

NIST SP 800-82 has the same structural limitation in this context. The standard's control baselines are verified against checklists. Compliance is self-assessed. Adversarial testing is not required at any level. CISA's Cyber Security Evaluation Tool, designed to help organizations assess against NIST 800-82, operates as a questionnaire with no external validation component. These tools are valuable for establishing a documented program baseline. They were not designed to measure adversarial resistance, and they don't.

Five Days vs. Forty-Two Days

What adversarially validated visibility actually produces in operational outcomes

Dragos's 2026 OT Cybersecurity Year in Review provides the clearest operational measurement of what the visibility and testing gap produces. Organizations with comprehensive OT visibility detected and contained ransomware incidents in an average of five days. The industry-wide average was 42 days.

That 37-day difference is not an abstraction. In a production environment, taking 42 days to detect and contain an incident means weeks of adversary access to process data, control systems, and operational network pathways. It means the scope of recovery is measured in weeks rather than days. OT system recovery is already 3-4x more costly than IT recovery under the best conditions. The gap between five days and 42 days is the difference between a contained incident and an extended shutdown.

The organizations operating at the five-day end of that range are not necessarily the ones with the highest maturity scores. They are the ones whose visibility and detection capabilities have been validated against real adversarial techniques, whose alerts are connected to response authority, and whose programs are recalibrated regularly against what threats actually look like. That is what adversarially validated maturity produces -- not a higher score, but an operational outcome that demonstrates the program actually works.

A Score Is a Hypothesis

Until tested adversarially, every maturity score is an assumption about performance, not evidence of it. CISA's red team tested a self-described mature organization and went undetected. The score was accurate. The assumption behind it wasn't.

Frameworks Measure Process, Not Resistance

IEC 62443 and NIST SP 800-82 were designed to measure documentation quality and process discipline. Both are useful. Neither answers what an attacker answers on day one of an engagement.

The Red Team Tests the Hypothesis

And keeps testing it. A recalibrated program that stops being challenged drifts back toward the conditions that made testing necessary. The five-day vs. 42-day containment split is what continuous adversarial validation actually produces.

The question every maturity score should eventually have to answer is simple: has anyone actually tried to break this?

A score is a hypothesis. The Red Team is how you test it.
