Why the graph is the product, not a feature

If the security knowledge graph is bolted on top of a SIEM, you have a dashboard. If it is the substrate, detection becomes traversal and half your tool stack collapses.

Every vendor has a graph now. Open any product brochure and there is a node-and-edge diagram on slide six, usually labelled "context graph" or "attack graph" or, lately, "agentic graph". Practitioners click around for a while, watch the spider-web wobble, and then go back to the alert queue. The graph view ends up living in a quarterly review deck and nowhere else.

The problem isn't that graphs are useless. The problem is that almost every vendor builds the graph as a feature: a read-only visualisation derived from the SIEM, enriched once an hour, sitting next to forty other tabs. The graph is downstream of the data. That is precisely backwards.

The position we hold, and the one Netgraph was built on, is simpler. The graph is not a feature on a SIEM. The graph is the SIEM. Every event, every asset, every identity, every CVE, every policy, every control is a typed node from the moment it enters the pipeline. Detection, response, hunt, exposure management and compliance are all traversals over the same substrate. When you do that, the rest of the product gets dramatically smaller and dramatically more useful.

The bolted-on graph anti-pattern

Here is what a bolted-on graph looks like in production. Data lands in a log lake. An ETL job runs every fifteen minutes, picks out the IPs and usernames and hostnames, and writes them into a graph database. A UI sits on top. Analysts can right-click an alert and "view context". The context view shows a hairball with two hundred nodes, no obvious path, and a loading spinner that takes thirty seconds to render the second hop.

This pattern fails for three structural reasons, none of which are about UI.

The graph is stale. By the time the ETL is done, the SOC has already moved on to triaging the next ticket. Decisions are being made on data that is up to fifteen minutes old. In a credential-stuffing scenario, that can be three thousand attempts ago.
The graph is lossy. Only a handful of fields are promoted to nodes. Everything else stays in the log lake. Which means the moment an analyst asks "did this asset talk to that one over a control-plane port at 02:00 last Tuesday?", you're back to writing a search query in the original SIEM. The graph adds a tab; it doesn't reduce work.
The graph has no schema discipline. A "user" node from the identity provider, a "user" from the EDR, and a "user" from the email gateway are three different nodes. Merging them is an afterthought. So traversals lie.
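
The schema-discipline failure is easiest to see in code. The following is a minimal Python sketch of identity resolution, not a description of how Netgraph or any other product does it. The record fields (source, username, email) and the choice of the lower-cased email address as the merge key are assumptions made purely for illustration; real resolution usually needs several keys and conflict handling.

```python
from collections import defaultdict

# Hypothetical records: the same person as seen by three telemetry sources.
records = [
    {"source": "idp", "username": "jdoe", "email": "J.Doe@example.com"},
    {"source": "edr", "username": "CORP\\jdoe", "email": "j.doe@example.com"},
    {"source": "mail", "username": "j.doe", "email": "j.doe@EXAMPLE.com"},
    {"source": "idp", "username": "asmith", "email": "a.smith@example.com"},
]


def canonical_key(record):
    """Assumed merge key: the lower-cased email address."""
    return record["email"].strip().lower()


def resolve_users(records):
    """Group source-specific user records under one canonical User node."""
    nodes = defaultdict(lambda: {"aliases": set(), "sources": set()})
    for record in records:
        node = nodes[canonical_key(record)]
        node["aliases"].add(record["username"])
        node["sources"].add(record["source"])
    return dict(nodes)


users = resolve_users(records)
for key, node in sorted(users.items()):
    print(key, sorted(node["sources"]))
```

Without the merge step, the three sightings of the same person stay as three unrelated nodes, and any traversal that starts from one of them silently misses the edges attached to the other two.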

What you end up with is a dashboard with a fancier widget. The detection logic still lives in correlation rules written in a vendor-specific search language. The graph never gets to participate in the actual decision.

The tell: ask the vendor how detection rules are expressed. If the answer is "you write a query, then optionally pivot to graph view", the graph is downstream. If the answer is "rules are written as graph patterns and matched continuously as nodes and edges arrive", the graph is the substrate. Everything else is decoration.

What changes when the graph is the substrate

Treat the graph as the storage and query model from day zero and three things collapse simultaneously: the correlation engine, the asset inventory, and the enrichment layer. They all become the same component. You stop owning three subsystems that have to agree with each other; you own one.

Detections become patterns, not searches

A search-based detection says: "find events where field X equals Y in the last 5 minutes." That's a filter, not a detection. A graph-pattern detection says: "find a user node that authenticated from a new ASN, then within 8 minutes touched a service-account node that has cross-tenant scope, and within another 12 minutes that service-account opened a connection to an asset tagged crown-jewel". That is a story. It also happens to be expressible in fewer than ten lines.

MATCH
  (u:User)-[:AUTH {asn_first_seen: true}]->(d:Device),
  (u)-[:ASSUMED_WITHIN {gap_lt: "PT8M"}]->(sa:ServiceAccount {cross_tenant: true}),
  (sa)-[:CONNECTED {gap_lt: "PT12M"}]->(a:Asset {tag: "crown-jewel"})
WHERE u.risk_score > 60
RETURN u, sa, a, path_evidence(u, a)

The point isn't the syntax. The point is that the same query expresses the detection, the correlation, the kill-chain stitching, and the evidence package. There is no second tool. There is no "now go look it up in the SIEM". The path itself is the alert, and the alert ships with its blast radius effectively pre-computed, because every node reachable from the matched asset (a in the pattern) is just one more hop away.
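
To make the semantics of that pattern concrete, here is a brute-force Python sketch of the same three-step match over an in-memory edge list. Everything in it is an assumption for illustration: the Edge dataclass, the node ids, the sample timestamps, and the reading of gap_lt as "strictly less than this long after the previous step". It is not the engine's algorithm; a real substrate would index edges and match incrementally as they arrive rather than loop over all of them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Edge:
    src: str
    rel: str
    dst: str
    ts: datetime
    new_asn: bool = False  # only meaningful on AUTH edges


# Hypothetical node properties, keyed by node id.
NODES = {
    "u:alice": {"type": "User", "risk_score": 72},
    "u:bob": {"type": "User", "risk_score": 15},
    "d:laptop-7": {"type": "Device"},
    "sa:deploy": {"type": "ServiceAccount", "cross_tenant": True},
    "a:ledger-db": {"type": "Asset", "tag": "crown-jewel"},
}

T0 = datetime(2025, 1, 7, 2, 0, 0)
EDGES = [
    Edge("u:alice", "AUTH", "d:laptop-7", T0, new_asn=True),
    Edge("u:alice", "ASSUMED", "sa:deploy", T0 + timedelta(minutes=5)),
    Edge("sa:deploy", "CONNECTED", "a:ledger-db", T0 + timedelta(minutes=14)),
    # Bob authenticates from a known ASN, so no path should start here.
    Edge("u:bob", "AUTH", "d:laptop-7", T0),
    Edge("u:bob", "ASSUMED", "sa:deploy", T0 + timedelta(minutes=1)),
]


def within(earlier, later, limit):
    """True if `later` follows `earlier` by less than `limit`."""
    return timedelta(0) <= later - earlier < limit


def match_paths(nodes, edges, min_risk=60):
    """Return (auth, assume, connect) edge triples that satisfy the pattern."""
    hits = []
    for auth in edges:
        if auth.rel != "AUTH" or not auth.new_asn:
            continue
        if nodes[auth.src].get("risk_score", 0) <= min_risk:
            continue
        for assume in edges:
            if assume.rel != "ASSUMED" or assume.src != auth.src:
                continue
            if not nodes[assume.dst].get("cross_tenant"):
                continue
            if not within(auth.ts, assume.ts, timedelta(minutes=8)):
                continue
            for conn in edges:
                if conn.rel != "CONNECTED" or conn.src != assume.dst:
                    continue
                if nodes[conn.dst].get("tag") != "crown-jewel":
                    continue
                if within(assume.ts, conn.ts, timedelta(minutes=12)):
                    hits.append((auth, assume, conn))
    return hits


alerts = match_paths(NODES, EDGES)
for auth, assume, conn in alerts:
    print(" -> ".join([auth.src, assume.dst, conn.dst]))
```

Each hit is the three edges that satisfied the pattern, which is why the match doubles as the evidence package: nothing has to be looked up afterwards to explain why the alert fired.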

Enrichment is not a pipeline stage, it's a node merge

In a traditional pipeline, enrichment is a separate batch job that adds fields to events. CVE data, GeoIP, threat intel, asset criticality — all of it joined in by reference. The cost is invisible until you try to backfill a year of logs with a new field. Then it becomes a six-week project.

In a graph-native model, enrichment is a node merge. CVE-2025-XXXXX is one node, in one place. Every asset running the affected component has an edge to it. Update the CVSS, the EPSS, the known-exploited flag, and every traversal that touches that node sees the new value instantly. No backfill. No "wait for the enrichment job to catch up".
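
A small Python sketch shows why the shared node removes the backfill. The graph here is a plain dictionary plus an edge list, and the node ids, the AFFECTED_BY relation name, and the placeholder CVE id are all invented for the example; none of them come from a real schema.

```python
# Hypothetical in-memory graph: node properties plus typed edges.
nodes = {
    "cve:EXAMPLE-0001": {"type": "CVE", "cvss": 6.5, "known_exploited": False},
    "asset:web-01": {"type": "Asset"},
    "asset:web-02": {"type": "Asset"},
    "asset:db-01": {"type": "Asset"},
}
edges = [
    ("asset:web-01", "AFFECTED_BY", "cve:EXAMPLE-0001"),
    ("asset:web-02", "AFFECTED_BY", "cve:EXAMPLE-0001"),
]


def exploited_exposure(nodes, edges):
    """Assets with an edge to a CVE currently flagged as known-exploited."""
    return sorted(
        src
        for src, rel, dst in edges
        if rel == "AFFECTED_BY" and nodes[dst]["known_exploited"]
    )


before = exploited_exposure(nodes, edges)

# "Enrichment" is one write to the shared CVE node, not a backfill of events.
nodes["cve:EXAMPLE-0001"].update(cvss=9.8, known_exploited=True)

after = exploited_exposure(nodes, edges)
print(before, after)
```

The query is identical before and after the update. Only the single CVE node changed, and both affected assets pick up the new flag on the next traversal because they reference the node rather than carrying a copy of its fields.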

The asset inventory is not a separate product

Most CMDBs are graveyards. They exist because the SIEM couldn't answer "what is this host", so a separate team built a separate database, and now there are three sources of truth and a reconciliation tool that nobody trusts. When the graph is the substrate, the asset is just a node with edges to its identity, its CVEs, its data classification, its owner, and its observed behaviour. The inventory is the live view of the graph, filtered to asset-typed nodes. There is nothing to reconcile because there is nothing to reconcile against.
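
As a rough illustration of "the inventory is a filtered view", the Python sketch below projects a toy graph onto its asset-typed nodes and their outgoing edges. The node ids, property names, and relation names (OWNED_BY, AFFECTED_BY) are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical graph contents; ids, types, and relation names are illustrative.
nodes = {
    "asset:pay-api": {"type": "Asset", "classification": "restricted"},
    "asset:wiki": {"type": "Asset", "classification": "internal"},
    "user:jdoe": {"type": "User"},
    "cve:EXAMPLE-0001": {"type": "CVE"},
}
edges = [
    ("asset:pay-api", "OWNED_BY", "user:jdoe"),
    ("asset:pay-api", "AFFECTED_BY", "cve:EXAMPLE-0001"),
    ("asset:wiki", "OWNED_BY", "user:jdoe"),
]


def inventory(nodes, edges):
    """Project the graph onto Asset nodes and their outgoing relations."""
    view = {}
    for node_id, props in nodes.items():
        if props["type"] != "Asset":
            continue
        related = defaultdict(list)
        for src, rel, dst in edges:
            if src == node_id:
                related[rel].append(dst)
        view[node_id] = {**props, "edges": dict(related)}
    return view


for asset_id, row in inventory(nodes, edges).items():
    print(asset_id, row["classification"], row["edges"])
```

Nothing is stored twice here: the inventory rows are computed from the same nodes and edges the detections traverse, so there is no second copy that can drift out of sync.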

Where the graph has to earn it

None of this works if the graph is slow. A two-hop traversal that takes thirty seconds is not a substrate; it is a museum exhibit. The graph has to honour four constraints to be the actual primary store.

Constraint	Target	Why it matters
Ingest latency	Sub-second from event to query-visible node/edge	Detections must fire on fresh state, not a 15-minute-old snapshot.
Traversal p95	Under 800ms for 3-hop on the hot tier	Analysts will not wait. Beyond ~1s the workflow degrades to "let me check the dashboard".
Schema evolution	Online; no rebuild	Detection engineers add node/edge types weekly. A rebuild kills velocity.
Cold replay	7+ years queryable	Retrospective detection against historical data is the whole point of a long-tail store.

The architecture trick is that the hot tier and the cold tier expose the same graph API. Analysts don't learn two query languages. Detection content doesn't ship in two flavours. There is one logical graph; physical placement is a tier concern, not a user concern. We've written about how this hot/cold split actually works in the graph-native correlation whitepaper — worth reading if you want the storage model rather than the philosophy.
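
The "one logical graph, two physical tiers" idea can be sketched in Python as a shared interface. This is an assumption-laden illustration, not the storage model from the whitepaper: GraphTier, DictTier, TieredGraph, and reachable are invented names, and the tiers are simple adjacency dictionaries standing in for real stores.

```python
from typing import Dict, Iterable, Protocol, Set


class GraphTier(Protocol):
    """The single query surface every tier is assumed to expose."""

    def neighbours(self, node_id: str) -> Set[str]: ...


class DictTier:
    """Stand-in for a physical tier, backed by an adjacency dictionary."""

    def __init__(self, adjacency: Dict[str, Iterable[str]]):
        self._adj = adjacency

    def neighbours(self, node_id: str) -> Set[str]:
        return set(self._adj.get(node_id, ()))


class TieredGraph:
    """One logical graph; each lookup is answered by both tiers."""

    def __init__(self, hot: GraphTier, cold: GraphTier):
        self.hot = hot
        self.cold = cold

    def neighbours(self, node_id: str) -> Set[str]:
        return self.hot.neighbours(node_id) | self.cold.neighbours(node_id)


def reachable(graph: GraphTier, start: str, max_hops: int) -> Set[str]:
    """Breadth-first expansion, written once against the shared interface."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        frontier = {n for node in frontier for n in graph.neighbours(node)} - seen
        seen |= frontier
    return seen - {start}


hot = DictTier({"user:jdoe": ["device:lap-7"], "device:lap-7": ["sa:deploy"]})
cold = DictTier({"sa:deploy": ["asset:archive-db"]})
graph = TieredGraph(hot, cold)

print(sorted(reachable(graph, "user:jdoe", 3)))
```

The traversal function never asks which tier an edge lives in. The same call runs against the hot tier alone, the cold tier alone, or the combined graph, which is the property that lets detection content ship in one flavour.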

The counter-argument, fairly stated

It would be dishonest to pretend the graph-first model is free. Three honest objections:

Graphs are harder to operate at scale than columnar tables. True. Which is why most vendors don't try, and the ones who do tend to keep the graph small. Netgraph's bet is that with a typed schema and a tiered store, the operating cost is comparable to a SIEM plus a CMDB plus an enrichment service plus a UEBA — i.e. what you were already paying.
Graph query languages are unfamiliar. Also true. We mitigate this by giving the detection engineer two surfaces: a graph pattern language for the substantive rules, and a SQL-flavoured projection for ad-hoc analytics. About 70% of detections end up as graph patterns; the rest are simple filters that don't need traversal.
Visualisations of large graphs are useless. Mostly true, and we agree. Netgraph almost never shows you a 500-node hairball. We show paths. A path is a graph projection that fits on a screen and reads like a sentence: user → device → identity → service → asset. That is the only graph picture an analyst ever wants to see.

Where bolted-on graphs sometimes do win: if your SOC's daily work is overwhelmingly single-source single-event triage (e.g. a managed EDR queue), a real graph substrate is overkill. Buy a good EDR, write decent playbooks, and skip the graph entirely. Substrate-grade graphs earn their cost when you have three or more telemetry sources that have to be reasoned about together.

What this looks like in the wild

The cleanest example we can share publicly is from the private-bank deployment in the BFSI case study. Before the migration, "investigate a suspicious authentication" involved the SIEM, the identity log, the EDR console, the firewall log, and a spreadsheet of crown-jewel assets. The L1 analyst's median time to "I have enough to escalate" was forty-three minutes.

After the migration, the same investigation was a single traversal that returned the user, the device, the assumed roles, the touched assets, the data classifications, the open tickets on those assets, and the relevant CVE exposure. Median time to escalate dropped to seven minutes. The L1 didn't get smarter. The data stopped being spread across five tools.

That number — seven minutes — is not a graph database benchmark. It's the side effect of making detection, context and remediation share the same noun. The graph wasn't a feature on top of the SIEM. The graph replaced the SIEM, the CMDB, the enrichment layer, and the UEBA. Four tools became one logical thing.

Key takeaways

A graph as a feature is a dashboard with a wobbly diagram. A graph as the substrate is the primary store, the detection engine, and the inventory in one.
Detections become path patterns. The pattern is the rule, the correlation, the evidence and the blast-radius query — all at once.
Enrichment stops being a pipeline stage and becomes a node-merge. CVEs, criticality, and identity context are first-class nodes, not bolted-on fields.
The non-negotiable engineering constraints: sub-second ingest, sub-second 3-hop traversals on hot, and a unified API across hot and cold tiers.
If your day is single-source EDR triage, you don't need this. If you reason across three or more telemetry sources, the graph stops being a luxury.

Next week we'll talk about a number nobody on your SOC measures: correlation debt. It's the reason MTTD charts look fine while breach cost keeps climbing.
