Watch any seasoned L2 analyst run an incident. They will, somewhere in the first ten minutes, draw a small diagram on a notepad or a whiteboard. Two boxes, an arrow, a third box, a question mark. They are sketching the blast radius — the set of things downstream of the compromise that need to be looked at, contained, or rebuilt. The sketch is rarely complete. It is almost always wrong by hour two. And the next analyst on shift will redraw it from scratch.

That sketch is the single most operationally important artefact of the entire incident, and most platforms treat it as analyst tribal knowledge. Blast radius is computed in the analyst's head, communicated verbally, and lost when the shift changes. It is somehow both the most important fact about the incident and the least systematically captured.

The argument of this post is simple. Blast radius is a structural property of the environment, not a cognitive artefact of an analyst. It should be a query. The same query, every time. Returning the same answer for the same incident, regardless of who is on shift.

What blast radius actually is

Blast radius is the set of nodes — assets, identities, data stores, services, downstream consumers — reachable from the compromised entry point under the assumption that the attacker has the access of the compromised entity. It is not "everything the compromised asset can technically talk to". It is "everything an attacker would plausibly pivot to from here, given the access model in place".

Three things have to be true for the blast radius to be computable:

  1. The graph has to know what the entry point is. That comes from detection.
  2. The graph has to know what edges exist from that node. That comes from telemetry — network, identity, application, service mesh.
  3. The graph has to know which edges represent access, not just connectivity. A node counts as reachable only if the identity has rights, the firewall permits the traffic, and the application accepts the request. Without this distinction, blast radius is a useless hairball.

The third point is where most analyses go wrong. If your blast radius query returns 4,000 reachable hosts because every server can ping every other server in the same VLAN, you have produced noise. The useful answer is "every node reachable via the compromised identity's effective permissions within three hops, weighted by the criticality of the destination". That is a much smaller and much more actionable set: often 30 to 200 nodes, not thousands.

Definitional discipline: blast radius is reachability via the compromised identity's effective permissions, not reachability via the underlying network. Treat them as separate queries. The network reachability is the worst case; the identity-weighted reachability is the working case. Most decisions should be made against the working case.
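To make the two-query distinction concrete, here is a minimal sketch in Python. The edge list, node names, and the `reachable` helper are all hypothetical; the only thing carried over from the post is the idea that access-typed edges and plain network edges are filtered separately and traversed to a fixed hop limit.

```python
from collections import deque

# Hypothetical edge list: (source, destination, edge_type).
# "NETWORK" means packets can flow; the other types mean the identity has usable access.
EDGES = [
    ("svc-account", "crm-db", "IDENTITY_REACH"),
    ("svc-account", "billing-api", "IDENTITY_REACH"),
    ("billing-api", "warehouse", "DATA_PLANE"),
    ("svc-account", "printer-01", "NETWORK"),
    ("svc-account", "kiosk-07", "NETWORK"),
    ("kiosk-07", "hr-share", "NETWORK"),
]

ACCESS_TYPES = {"IDENTITY_REACH", "CONTROL_PLANE", "DATA_PLANE"}


def reachable(entry, edges, allowed_types, max_hops=3):
    """Breadth-first search from entry, following only edges whose type is allowed.

    Returns a dict mapping each reachable node to its hop distance from entry.
    """
    adjacency = {}
    for src, dst, edge_type in edges:
        if edge_type in allowed_types:
            adjacency.setdefault(src, []).append(dst)
    hops = {entry: 0}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adjacency.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[entry]  # the entry point is not part of its own blast radius
    return hops


# Worst case: anything the network lets you touch. Working case: access edges only.
worst_case = reachable("svc-account", EDGES, ACCESS_TYPES | {"NETWORK"})
working_case = reachable("svc-account", EDGES, ACCESS_TYPES)
print(len(worst_case), sorted(worst_case))
print(len(working_case), sorted(working_case))
```

On this toy graph the worst case returns six nodes and the working case three. The gap is the noise that the definitional discipline removes.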

Why CVSS is not blast radius

The most common misuse we see is the CVSS-as-blast-radius confusion. A vulnerability has a high CVSS score; therefore, the team treats the affected machine as a serious incident; therefore, blast radius is "high". This collapses two unrelated concepts.

CVSS measures the inherent severity of a vulnerability: how exploitable it is, what privileges it grants, what impact it has on the affected component. It is a property of the vulnerability, not of the asset it sits on. It says nothing about your environment. A 9.8 critical on an isolated workstation in a test lab has a blast radius of roughly nothing. A 6.5 medium on a service account that bridges your CRM, your billing system, and your data warehouse has a blast radius the size of your company.

Asset profile                             | CVE CVSS | Identity reach                 | Real blast radius
Isolated test box, no production data     | 9.8      | Local only                     | 1 node
Build server, signs production artefacts  | 6.5      | Pushes to all prod             | ~500 production nodes + supply chain
Analyst's laptop, full SSO                | 7.4      | SSO to ~40 SaaS apps           | 40 SaaS tenants + downstream data
Domain controller, vulnerable to relay    | 8.1      | Authoritative for whole forest | The forest

CVSS tells you the door is broken. Blast radius tells you what's behind the door. Both matter, but they answer different questions, and using one as a proxy for the other produces consistently bad decisions. We have seen organisations patch high-CVSS, low-blast-radius issues in two days while leaving medium-CVSS, high-blast-radius issues open for six months. The risk model was using the wrong axis.
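A quick way to see the wrong-axis problem is to sort the same assets both ways. The sketch below uses three rows from the table above, with the table's approximate node counts; the domain controller is left out because "the forest" has no number attached. The field names are illustrative.

```python
# Rows taken from the table above; node counts are the table's approximate figures.
assets = [
    {"asset": "isolated test box", "cvss": 9.8, "reachable_nodes": 1},
    {"asset": "build server", "cvss": 6.5, "reachable_nodes": 500},
    {"asset": "analyst laptop", "cvss": 7.4, "reachable_nodes": 40},
]

# Patch priority if CVSS is the axis.
by_cvss = [a["asset"] for a in sorted(assets, key=lambda a: a["cvss"], reverse=True)]

# Patch priority if blast radius is the axis.
by_blast_radius = [
    a["asset"] for a in sorted(assets, key=lambda a: a["reachable_nodes"], reverse=True)
]

print(by_cvss)
print(by_blast_radius)
```

For these three rows the two orderings are exact opposites: the asset CVSS ranks first is the one blast radius ranks last.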

The one-hop query

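If the graph is properly typed, blast radius is a one-hop query in the sense that matters to the analyst: one query, one round trip, no intermediate joins. The traversal itself follows up to three edges. The pattern: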

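// blast radius from a compromised entry node `n0`
// a variable-length match binds a list of relationships, so every hop is filtered explicitly
MATCH path = (n0 {id: $entry_id})-[*1..3]->(target)
WHERE ALL(rel IN relationships(path)
          WHERE rel.access_type IN ['IDENTITY_REACH', 'CONTROL_PLANE', 'DATA_PLANE'])
  // assumption: criticality_rank is a numeric form of criticality (1 = low, 2 = medium, 3 = high);
  // comparing the strings directly would sort 'high' below 'medium'
  AND target.criticality_rank >= 2
RETURN
  target,
  path,
  evidence_for_each_hop(path) AS evidence,
  containment_options(target) AS options
ORDER BY target.criticality_rank DESC, length(path) ASC
LIMIT 500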

The query is short because the graph has done the work. Edges are typed by access semantics, not just by network connectivity. Criticality is a node property derived from data classification, ownership, and the asset's downstream consumers. Evidence-per-hop is a function that returns the underlying telemetry that justifies the edge. Containment options is a function that returns the operationally feasible isolation moves for each node — "revoke session", "disable account", "quarantine host", "block egress range", "rotate secret".
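For readers who want a feel for what "the graph has done the work" means, here is a small Python sketch of the data shapes involved. It is not the platform's implementation: the `Edge` class, the node kinds, and the lookup table are hypothetical stand-ins, and the two functions only borrow their names from the query.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    source: str
    target: str
    access_type: str  # e.g. "IDENTITY_REACH", "CONTROL_PLANE", "DATA_PLANE"
    evidence: list = field(default_factory=list)  # ids of telemetry records justifying the edge


# Hypothetical mapping from node kind to operationally feasible isolation moves.
CONTAINMENT_BY_KIND = {
    "session": ["revoke session"],
    "account": ["disable account", "rotate secret"],
    "host": ["quarantine host", "block egress range"],
}


def evidence_for_each_hop(path):
    """Return (source, target, evidence) for every edge on the path, in order."""
    return [(edge.source, edge.target, edge.evidence) for edge in path]


def containment_options(node_kind):
    """Return the isolation moves available for a node of this kind (empty if unknown)."""
    return CONTAINMENT_BY_KIND.get(node_kind, [])


path = [
    Edge("laptop", "sso-session", "IDENTITY_REACH", ["auth-log-1"]),
    Edge("sso-session", "crm", "DATA_PLANE", ["tls-flow-7"]),
]
for hop in evidence_for_each_hop(path):
    print(hop)
print(containment_options("host"))
```

The point of the shape is that evidence and containment moves are attached to edges and nodes at ingestion time, so the query only has to read them.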

If your platform requires you to write fifteen separate queries to produce this result and join them in a spreadsheet, you do not have blast radius as a concept. You have blast radius as a research project.

What containment scope follows from

Containment scope is the operational consequence of blast radius. Containment is expensive — every disabled account is a phone call, every quarantined host is a help-desk ticket, every blocked egress range is a business risk. So containment scope should be the smallest set of actions that meaningfully reduces the attacker's effective reachability.

This is a different query, but it uses the blast-radius set as input. Find the minimum cut: which set of edges, if removed, disconnects the compromised entry from the highest-criticality reachable nodes? In graph terms, this is a min-cut on a weighted reachability subgraph. In operational terms, it is "what is the smallest number of accounts to disable and ports to block to ringfence the problem?"

This is exactly the question your incident commander is asking at hour two. Until recently, the answer came from intuition. Now it can come from the same substrate that produced the detection. We've written up the operational pattern in more depth in the DPDP 72-hour runbook, because containment scope feeds directly into the "mitigation taken" field of the regulatory notification.
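The min-cut framing can be sketched with a textbook max-flow routine (Edmonds-Karp). Everything specific here is an assumption for illustration: the node names, and the idea that each edge's weight is the disruption cost of removing it. With several crown-jewel targets you would connect them all to one synthetic sink first.

```python
from collections import deque


def min_cut(edges, source, sink):
    """Return (total cost, cut edges) for the cheapest edge set separating source from sink.

    edges maps (u, v) -> cost of removing that edge. Uses Edmonds-Karp max-flow;
    by the max-flow/min-cut theorem the flow value equals the cheapest cut.
    """
    residual = {}
    for (u, v), cost in edges.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + cost
        residual[v].setdefault(u, 0)  # reverse edge for the residual graph

    def bfs_parents():
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, capacity in residual[u].items():
                if capacity > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    total = 0
    while True:
        parents = bfs_parents()
        if sink not in parents:
            break
        # Find the bottleneck on the augmenting path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parents[v] is not None:
            bottleneck = min(bottleneck, residual[parents[v]][v])
            v = parents[v]
        v = sink
        while parents[v] is not None:
            u = parents[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

    # Nodes still reachable in the residual graph are on the attacker's side of the cut.
    attacker_side = set(parents)
    cut = sorted((u, v) for (u, v) in edges if u in attacker_side and v not in attacker_side)
    return total, cut


# Hypothetical blast-radius subgraph; weights are disruption costs of removing each edge.
ACTIONS = {
    ("laptop", "sso-session"): 1,   # revoke session
    ("laptop", "vpn"): 4,           # disable VPN account
    ("vpn", "billing"): 1,          # block one port
    ("sso-session", "crm"): 3,
    ("sso-session", "billing"): 3,
    ("crm", "crown-jewels"): 5,
    ("billing", "crown-jewels"): 5,
}

total, cut = min_cut(ACTIONS, "laptop", "crown-jewels")
print(total, cut)
```

On this toy graph the cheapest ringfence is to revoke the session and block the one port, for a total cost of 2. Disabling the VPN account alone costs more and still leaves the SSO path open.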

The cultural shift that makes this work

The technical change is the easy part. The cultural change is what trips most teams.

Senior analysts often resist a queryable blast radius because their value to the team has historically been the ability to draw the diagram. If the platform draws it for them, what are they being paid for? The honest answer is: for everything after the diagram. Deciding which containment moves to make, communicating to stakeholders, negotiating with business owners about acceptable disruption, sequencing the evidence collection. The diagram-drawing is the least leveraged thing a senior analyst does, and automating it frees their time for the parts of the job that actually need a human.

The other resistance comes from "but what if the graph is wrong?" Fair concern. The mitigation is twofold. First, every edge in the graph carries provenance, meaning the telemetry that produced it, so analysts can audit any hop on demand. Second, the blast-radius query should report confidence as a property of each edge. An edge derived from a TLS handshake captured in network telemetry has high confidence; an edge inferred from configuration drift has medium confidence; an edge inferred from a stale CMDB record has low confidence. The query returns all of them, sorted by confidence, so the analyst can decide which hops to trust.
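One way to turn per-edge confidence into something sortable is sketched below. The paths and node names are made up, and scoring a whole path by its weakest hop is our assumption for the example, not something the query above prescribes.

```python
CONFIDENCE_RANK = {"high": 3, "medium": 2, "low": 1}

# Hypothetical paths; each hop is (source, destination, provenance, confidence).
paths = [
    [("laptop", "sso", "TLS handshake in network telemetry", "high"),
     ("sso", "crm", "TLS handshake in network telemetry", "high")],
    [("laptop", "build-01", "configuration drift", "medium")],
    [("laptop", "legacy-db", "stale CMDB record", "low")],
    [("laptop", "sso", "TLS handshake in network telemetry", "high"),
     ("sso", "wiki", "stale CMDB record", "low")],
]


def path_confidence(path):
    """Assumption: a path is only as trustworthy as its weakest hop."""
    return min((hop[3] for hop in path), key=CONFIDENCE_RANK.__getitem__)


# Most trustworthy paths first; ties keep their original order.
ranked = sorted(paths, key=lambda p: CONFIDENCE_RANK[path_confidence(p)], reverse=True)
for p in ranked:
    print(p[-1][1], path_confidence(p))
```

Note the last path: one stale CMDB hop drags an otherwise well-evidenced route down to low confidence, which is exactly the hop an analyst should audit first.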

The honest caveat: blast radius queries are only as good as the underlying edge data. If your identity telemetry is missing, your blast radius understates identity reach. If your network telemetry is sampled, your blast radius understates lateral movement. The graph does not invent edges; it traverses what telemetry has given it. Invest in the telemetry seams before you trust the query.

What you measure once you have it

When blast radius is queryable, three new metrics become possible. We track all three.

  • Median blast radius per detection class. For each rule, what is the typical reachable node count at first firing? Detections that consistently produce blast radii of 1 or 2 are likely useless. Detections that consistently produce blast radii of 500+ may be over-alerting on benign infrastructure. The shape of the distribution is the tuning signal.
  • Containment lift. For each containment action taken, what is the reduction in reachable critical nodes? This is the operational ROI of the action. A containment that reduces reachable crown jewels by 80% justifies the disruption. One that reduces them by 5% probably doesn't.
  • Blast-radius decay time. From first detection to "no critical nodes reachable from entry", how long does it take? This is the new MTTR-equivalent that actually correlates with breach cost. It is also a metric that legacy SIEMs cannot produce.
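The three metrics are simple arithmetic once the blast-radius set is available as data. The sketch below assumes hypothetical record shapes (tuples of detection class and node count, and time-ordered snapshots of reachable critical nodes); the detection class names and numbers are invented for the example.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: (detection class, reachable node count at first firing).
first_firings = [
    ("impossible-travel", 2), ("impossible-travel", 1), ("impossible-travel", 2),
    ("service-account-misuse", 140), ("service-account-misuse", 90),
]


def median_blast_radius(records):
    """Median reachable node count at first firing, per detection class."""
    by_class = {}
    for detection_class, node_count in records:
        by_class.setdefault(detection_class, []).append(node_count)
    return {name: median(counts) for name, counts in by_class.items()}


def containment_lift(critical_before, critical_after):
    """Fraction of reachable critical nodes removed by one containment action."""
    if critical_before == 0:
        return 0.0
    return (critical_before - critical_after) / critical_before


def decay_time(detected_at, snapshots):
    """Time from detection to the first snapshot with zero reachable critical nodes.

    snapshots is a time-ordered list of (timestamp, reachable critical node count).
    Returns None if reach never drops to zero.
    """
    for timestamp, critical_count in snapshots:
        if critical_count == 0:
            return timestamp - detected_at
    return None


detected = datetime(2024, 1, 10, 9, 0)
snapshots = [
    (datetime(2024, 1, 10, 9, 30), 10),
    (datetime(2024, 1, 10, 11, 0), 2),
    (datetime(2024, 1, 10, 14, 15), 0),
]

print(median_blast_radius(first_firings))
print(containment_lift(10, 2))
print(decay_time(detected, snapshots))
```

Here the first containment action takes reachable critical nodes from 10 to 2, a lift of 0.8, and the decay time is 5 hours 15 minutes, regardless of when the ticket was closed.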

The last metric is the one we are increasingly asked about by enterprise security leaders. MTTR-as-traditionally-measured stops the clock when the ticket closes. Blast-radius decay stops it when the attacker's effective reach is back to normal. Those are different events. The gap between them is where post-incident-cleanup-from-hell lives.

Where this stops being optional

If you operate in a regulated sector and a real incident lands, the regulator will ask: what was the scope, how do you know, what did you contain, and how do you know you contained the right things? Each of those questions is a blast-radius question. You can answer them by hand, slowly, with the help of an external forensics firm and a panicked four-week project. Or you can answer them in the meeting itself, from the platform, with a query history that shows your reasoning at each step. We have watched both. The second outcome is dramatically less expensive.

The air-gapped manufacturing plant case study walks through one such scenario where the regulator asked a containment-scope question two weeks after the incident closed, and the answer was reproducible from the graph in under fifteen minutes. The alternative would have been a forensics retainer.

Key takeaways

  • Blast radius is a structural property of the environment, not analyst tribal knowledge. It should be a query, not a sketch.
  • Network reachability and identity-effective reachability are different things. Use the identity-weighted version for operational decisions.
  • CVSS measures the door. Blast radius measures what's behind it. Stop using one as a proxy for the other.
  • Containment scope is a min-cut problem on the blast-radius subgraph — the smallest set of edges that disconnects the entry from the crown jewels.
  • Three new metrics: median blast radius per detection class, containment lift, and blast-radius decay time. The last one is the new MTTR.

One more in this series: why every new detection should replay against the last 90 days of hot data and the last 7 years of cold data before going live — and why most platforms structurally cannot do this.