Root Cause Analysis: The Hidden Risk Decisions We Never Knew We Made
What if the hardest root causes to analyze are those buried in risk decisions we never realized we were making?
The Symptom Problem
Most cybersecurity “root cause” analyses stop at symptoms because they miss the implicit risk acceptances hidden in everyday business decisions. Every “trade-off” in budget discussions, every “we’ll fix it later” in sprint planning, every “do more with less” in resource allocation—these aren’t just business decisions, they’re unacknowledged risk acceptances that become tomorrow’s incidents.
The Language That Obscures Risk
Consider how we frame these choices:
- When we choose the cheaper vendor, we call it cost optimization—but aren’t we implicitly accepting security risks?
- When we defer redundancy funding, we celebrate efficiency—but aren’t we accepting single points of failure?
- When we rush deployments, we praise agility—but aren’t we trading safety for speed?
The language we use obscures the risk dimension of these decisions.
The Competence Gap
The challenge runs deeper than awareness. Strategic root cause analysis demands competencies most security teams haven’t developed:
- Systems thinking to understand cascading effects
- Organizational psychology to recognize behavioral patterns
- Power dynamics awareness to trace decision influence
- Budget politics understanding to follow resource allocation
It’s intellectually demanding work that crosses disciplines. Even when we have these skills, tracing an incident like “admin misconfigured cloud storage” back through layers of implicit decisions—understaffing, inadequate training budgets, choosing speed over verification processes—requires both organizational courage and political capital that few possess.
The Three Levels of Root Cause
Operational Level: Technical Symptoms
At the surface, we find:
- “Human error”
- “Misconfiguration”
- “Inadequate access controls”
- “Missed patch”
But isn’t human error just an outcome of system design? Predictable failures in systems that enable mistakes aren’t random events—they’re inevitable results of how we structured work.
Tactical Level: Process Failures
Digging deeper, we identify:
- “Inadequate training”
- “Poor processes”
- “Insufficient tools”
- “Lack of automation”
These are closer to root causes, but still symptoms of deeper problems. Why was training inadequate? Why were processes poor? The answers lie at the strategic level.
Strategic Level: Implicit Risk Decisions
At the root, we discover that most critical risk decisions were never explicitly made. They’re implicit in:
- Vendor selections that prioritized cost over security capability
- Organizational structures that separated security from operations
- Project priorities that always delayed security work
- Budget allocations that chronically understaffed security teams
- Timeline pressures that normalized cutting corners
These seemed like pure business decisions at the time. No one said “we accept the risk of inadequate security.” But that’s exactly what happened.
The Courage Problem
Competence alone isn’t enough; this work also demands courage. Making implicit decisions explicit is terrifying because it reveals:
- Budget trade-offs that seemed reasonable but created vulnerabilities
- Leadership priorities that systemically undervalued security
- Cultural norms that rewarded speed over safety
- Organizational structures that guaranteed coordination failures
A Real Example: The Configuration Error
Operational root cause: “Administrator misconfigured cloud storage bucket permissions”
Tactical root cause: “Inadequate training on cloud security and insufficient peer review process”
Strategic root cause:
- Budget allocated three cloud administrators for 50+ applications
- Training budget cut to meet quarterly targets
- Peer review process eliminated to “move faster”
- Security team excluded from cloud architecture decisions
- Vendor selected based on cost, not security capabilities
- Deployment timeline compressed to meet executive promises
Each of these was a risk decision made in the language of business optimization. No one wrote “we accept the risk of data exposure.” But every one of these choices implicitly made that acceptance.
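To make the operational level concrete, here is a minimal sketch, in Python purely for illustration, of the kind of guardrail the eliminated peer review process might have provided: a check that flags bucket policy statements granting access to any principal. It assumes an AWS-style JSON policy document; the bucket name and policy are invented.

```python
# Minimal sketch: flag bucket policy statements that grant public access.
# Assumes an AWS-style JSON policy; the bucket and policy are invented.
import json

POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
""")

def public_statements(policy: dict) -> list[dict]:
    """Return Allow statements whose principal is the wildcard "*"."""
    flagged = []
    for stmt in policy.get("Statement", []):
        principal = stmt.get("Principal")
        is_public = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if stmt.get("Effect") == "Allow" and is_public:
            flagged.append(stmt)
    return flagged

for stmt in public_statements(POLICY):
    print("Public access:", stmt.get("Action"), "on", stmt.get("Resource"))
```

The ten lines of Python are the easy part. A check like this only runs if someone is staffed to write it, a review gate exists to enforce it, and the deployment timeline leaves room for it, which is exactly what the strategic-level decisions above removed.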
The Context Problem
Root causes are so contextual that our industry struggles with the very concept. What’s a “root cause” in one organization is a downstream effect in another. The misconfiguration that caused one incident was enabled by understaffing, which was driven by budget constraints, which reflected leadership priorities, which stemmed from market pressures.
Where do you stop calling something a “root cause”? When do you accept that some causes are beyond your organization’s ability to change?
What This Means for You
For Security Leaders
Stop accepting “human error” or “process failure” as root causes. Push the analysis deeper:
- What business decisions enabled this failure?
- What trade-offs were made that we didn’t recognize as risk decisions?
- What implicit risk acceptances need to be made explicit?
For Executives
The root causes of security incidents often trace back to business decisions you made—or failed to make:
- That vendor selection where security was item #7 on the evaluation
- That budget cycle where security training was “nice to have”
- That project timeline that compressed testing to “just get it done”
- That organizational structure that isolated security from operations
For Risk Managers
Start documenting the implicit risk acceptances:
- Budget decisions that affect security capability
- Timeline pressures that compromise verification
- Resource constraints that limit coverage
- Vendor selections that trade security for cost
When incidents happen, you’ll have the trail of decisions that led there.
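One way to start, sketched below in Python purely for illustration: a record type that forces the business decision and its implied risk acceptance into the same entry, with an owner and a review date. The field names and example values are assumptions, not a standard.

```python
# Sketch of a machine-readable risk-acceptance record. Field names and
# example values are illustrative assumptions, not an industry standard.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class RiskAcceptance:
    decision: str            # the business decision as it was actually framed
    business_rationale: str  # why it looked reasonable at the time
    implied_risk: str        # the risk acceptance hidden inside the decision
    owner: str               # who made, and can revisit, the decision
    decided_on: date
    review_by: date          # implicit acceptances should expire, not linger
    related_assets: list[str] = field(default_factory=list)

# Hypothetical entry for the vendor selection from the example above.
entry = RiskAcceptance(
    decision="Selected cloud vendor primarily on cost",
    business_rationale="Met quarterly savings target",
    implied_risk="Weaker security capabilities; higher chance of data exposure",
    owner="Procurement, with CISO sign-off",
    decided_on=date(2022, 3, 1),
    review_by=date(2023, 3, 1),
    related_assets=["cloud storage buckets"],
)
print(entry.implied_risk)
```

A spreadsheet works just as well. What matters is that the implied risk, an owner, and a review date are captured at decision time rather than reconstructed after the incident.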
The Hard Question
Perhaps the real question isn’t why we lack good root cause analysis, but whether our organizations are ready to see risk decisions they never realized they were making.
Are you prepared to trace that “human error” back through years of budget cuts, organizational politics, and executive priorities? Are you ready to tell leadership that the root cause of this incident was decisions they made three years ago?
That’s where real root cause analysis lives—and why it’s so rarely done.
Bottom Line
Until we develop both the competencies to trace incidents to their strategic origins and the courage to surface implicit risk acceptances, we’ll keep producing symptom-focused reports that satisfy auditors but prevent nothing.
Real root cause analysis is organizational archaeology—digging through layers of decisions, trade-offs, and priorities to find where risk acceptance was buried in business-as-usual operations.
The organizations that master this archaeology are the ones that actually learn from incidents. Everyone else just keeps finding “human error.”
What Implicit Risk Decisions Have You Made?
Look at your last major security decision. How many business trade-offs were involved? How many of those trade-offs were explicitly framed as risk acceptances?
If the answer is “none,” you’re making implicit risk decisions. The question is whether you’ll discover them before or after they become your next incident’s “root cause.”
Inspired by: Research on root cause analysis in cybersecurity by Sarah Fluchs and the broader industry discussion on the scarcity of actionable root cause reports.
Originally published: LinkedIn
Connect: Follow for more insights on risk management and incident response on LinkedIn • Mastodon • Bluesky