Wells Fargo

Field Study

September 15, 2018

AI Governance, Decision Systems, Enterprise AI

Designing Human-Governed AI in a Regulated Enterprise

How Wells Fargo operationalized AI experimentation, trust, and decision governance across 17 business units

Context

In 2019, Wells Fargo’s AI Enterprise Solutions organization faced a problem that had little to do with model capability.

The institution had talent. It had data. It had executive interest in artificial intelligence.

What it lacked was operational alignment.

Teams across fraud, operations, compliance, and customer service were attempting to explore machine learning in parallel, often without shared language, governance structure, or clear ownership of risk. Some groups viewed AI as transformational. Others viewed it as a regulatory liability waiting to happen.

Most initiatives stalled long before deployment.

The challenge was not simply designing AI systems.

The challenge was designing organizational trust around them.

This work focused on helping Wells Fargo create repeatable, human-centered frameworks for AI experimentation that could survive inside a highly regulated environment without collapsing under ambiguity, fear, or procedural friction.

The Institutional Problem

Across 17 business units, teams struggled with the same recurring questions:

Who owns model accountability?
What level of interpretability is required?
How should legal and compliance teams participate?
Where does human judgment remain mandatory?
What qualifies as experimentation versus production risk?

Most teams were not blocked by technology.

They were blocked by uncertainty.

The absence of a shared operational framework created hesitation at every stage of the process. Data science teams could build prototypes, but business leaders often lacked confidence in deployment pathways. Compliance teams entered conversations too late. Product discussions drifted into abstract ambition without clear hypotheses, governance boundaries, or measurable decision logic.

The organization did not need more AI enthusiasm.

It needed systems capable of making AI legible.

Designing Trust Before Scale

The work began with a simple premise:

AI adoption inside regulated environments depends less on raw model capability than on whether institutions can understand, govern, and intervene in the decision process.

This shifted the focus away from speculative product thinking and toward operational design.

Discovery sessions were conducted with stakeholders across legal, fraud, customer operations, compliance, product, and data governance. Rather than treating governance as a downstream constraint, governance became part of the design process itself.

The goal was to reduce ambiguity before scaling experimentation.

Several core principles emerged:

Human override must remain visible and intentional
Interpretability matters as much as predictive accuracy
Escalation pathways should be designed before deployment
AI pilots should be framed as testable hypotheses, not executive mandates
Shared language reduces organizational resistance

These principles became foundational across multiple initiatives.

The AI Sprint Kit

One major initiative involved designing a repeatable framework for AI discovery and experimentation across business units.

The resulting system became known internally as the AI Sprint Kit.

The kit introduced structured workflows that helped teams move from vague interest in AI toward testable operational concepts grounded in governance, feasibility, and user value.

Core components included:

AI Framing Canvas

A structured framework for defining:

problem space
expected model behavior
data readiness
operational dependencies
human checkpoints
decision ownership
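
As an illustration only, the six canvas sections could be captured as a small record that reports which parts a team has not yet filled in. This is a minimal sketch: the class name, field names, and example text are hypothetical and are not taken from the internal kit.

```python
from dataclasses import dataclass, fields


@dataclass
class FramingCanvas:
    """Hypothetical record; one field per canvas section listed above."""
    problem_space: str = ""
    expected_model_behavior: str = ""
    data_readiness: str = ""
    operational_dependencies: str = ""
    human_checkpoints: str = ""
    decision_ownership: str = ""

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Example values are invented for illustration.
canvas = FramingCanvas(
    problem_space="Triage of high-risk customer messages",
    human_checkpoints="Operator confirms every escalation",
)
print(canvas.missing_sections())
```

A check like this makes the gap visible early: a canvas with blank sections is a conversation still to be had, not a pilot ready to start.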

Stakeholder Alignment Maps

Visual systems that clarified who owned:

data
decisions
escalation authority
compliance review
post-launch accountability
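
The maps themselves were visual, but the underlying idea can be sketched as a lookup from each responsibility to a named owner, with a check for anything left unowned. The function name and the team assignments below are placeholders assumed for the example.

```python
# The five responsibilities named in the list above.
REQUIRED_RESPONSIBILITIES = (
    "data",
    "decisions",
    "escalation authority",
    "compliance review",
    "post-launch accountability",
)


def unowned(alignment_map: dict[str, str]) -> list[str]:
    """List responsibilities that have no named owner yet."""
    return [
        r for r in REQUIRED_RESPONSIBILITIES
        if not alignment_map.get(r, "").strip()
    ]


# Hypothetical assignments, for illustration only.
alignment_map = {
    "data": "Data Governance",
    "decisions": "Fraud Operations lead",
    "compliance review": "Compliance",
}
gaps = unowned(alignment_map)
print(gaps)
```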

Scientific Method Playbook

A hypothesis-driven structure that reframed AI initiatives as measurable experiments rather than aspirational product narratives.
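
One way to picture the reframing is a hypothesis written down with its metric and target before the pilot runs, so the result can be judged against something fixed. The sketch below assumes that shape; the class, metric name, and numbers are invented for illustration and do not come from the playbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PilotHypothesis:
    """Hypothetical record of a pilot framed as a testable claim."""
    statement: str
    metric: str
    baseline: float
    target: float
    lower_is_better: bool = True

    def supported_by(self, observed: float) -> bool:
        """True when the observed value meets the target set in advance."""
        if self.lower_is_better:
            return observed <= self.target
        return observed >= self.target


# All figures below are made up for the example.
h = PilotHypothesis(
    statement="Surfacing risk signals shortens response lag for flagged messages",
    metric="median_response_minutes",
    baseline=60.0,
    target=45.0,
)
print(h.supported_by(40.0), h.supported_by(50.0))
```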

Trust and Feedback Loops

Embedded review checkpoints designed to preserve interpretability, auditability, and post-pilot reflection.

The system reduced confusion between executive expectations and engineering execution while helping teams establish safer experimentation pathways.

Pilot groups cut the average time from concept to testable initiative by a factor of roughly six.

More importantly, the organization began developing a shared operational language around AI.

Human-in-the-Loop Governance

Another major area of focus centered on human oversight inside AI-assisted workflows.

Rather than positioning machine learning systems as autonomous decision-makers, the work emphasized AI as a support layer for human judgment.

This became especially important in customer operations environments where language, urgency, and compliance risk intersected in real time.

The NLP initiative focused on operationalizing insight from unstructured customer communication, including:

support notes
chat transcripts
escalation reports
customer service interactions

Several interaction patterns emerged as critical.

Inline Confidence Scores

Prediction confidence and sentiment scoring remained visible to users rather than hidden behind opaque automation.

Feedback and Correction Modes

Frontline operators could validate, reject, or correct model outputs directly inside the workflow.

This transformed operational usage into an ongoing learning system rather than a static deployment.
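
A minimal sketch of how the two patterns above might fit together: the prediction carries its confidence alongside the label, and the operator's response is recorded as one of three verdicts. The type names, labels, and values are assumptions made for the example, not the system's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    VALIDATED = "validated"
    REJECTED = "rejected"
    CORRECTED = "corrected"


@dataclass
class Prediction:
    message_id: str
    label: str
    confidence: float  # kept on the record so it can be shown to the operator


@dataclass
class Feedback:
    prediction: Prediction
    verdict: Verdict
    corrected_label: Optional[str] = None

    def final_label(self) -> Optional[str]:
        """Label kept after review; None when the output was rejected outright."""
        if self.verdict is Verdict.VALIDATED:
            return self.prediction.label
        if self.verdict is Verdict.CORRECTED:
            return self.corrected_label
        return None


# Hypothetical example: the operator corrects a low-confidence label.
fb = Feedback(Prediction("m-1", "complaint", 0.62), Verdict.CORRECTED, "fraud_report")
print(fb.final_label())
```

Keeping the original prediction next to the operator's verdict is what lets reviewed cases feed later evaluation or retraining.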

Escalation Signal Layers

Language patterns associated with compliance risk, customer retention concerns, or operational urgency were surfaced proactively to support earlier intervention.
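
To make the idea concrete, the sketch below uses plain keyword patterns as a stand-in for whatever language model the initiative actually used. The three signal names follow the categories above; the phrases and the function name are invented for illustration.

```python
import re

# Keyword stand-ins for learned language patterns; phrases are hypothetical.
SIGNAL_PATTERNS = {
    "compliance_risk": re.compile(r"\b(regulator|lawsuit|unauthorized)\b", re.I),
    "retention_concern": re.compile(r"\b(close my account|switch banks|cancel)\b", re.I),
    "urgency": re.compile(r"\b(urgent|immediately|right away)\b", re.I),
}


def escalation_signals(text: str) -> list[str]:
    """Return the names of every signal whose pattern appears in the text."""
    return [name for name, pattern in SIGNAL_PATTERNS.items() if pattern.search(text)]


signals = escalation_signals(
    "This charge was unauthorized and I will close my account immediately."
)
print(signals)
```

The signals are surfaced to a person as a prompt for earlier attention. They do not decide the outcome.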

The resulting workflows reduced average response lag for high-risk customer communications while improving user confidence in triage prioritization.

The broader lesson was clear:

Trust does not emerge from automation alone.

Trust emerges when people understand how decisions are being shaped and retain meaningful authority within the system.

Cross-Functional Alignment as System Design

One of the most important outcomes of the engagement had little to do with interfaces.

The deeper challenge involved aligning organizational incentives across teams that historically operated in isolation.

Legal teams evaluated exposure.

Data science teams optimized for model performance.

Product groups prioritized delivery velocity.

Operations teams focused on stability and escalation handling.

Without a shared framework, every AI conversation fragmented.

The work introduced collaborative rituals and governance structures that allowed these groups to participate earlier and more constructively in the design process.

This changed the institutional posture toward experimentation.

Several dormant machine learning initiatives that had previously stalled were eventually reevaluated and approved for deployment once governance and interpretability structures became clearer.

The cultural shift was subtle but important.

The organization moved from:

“We cannot do this safely.”

toward:

“Here is how we test this responsibly.”

That distinction changed the trajectory of adoption.

Outcomes

The collective initiatives contributed to measurable operational and organizational outcomes:

AI sprint adoption across 17 business units
Faster movement from concept to pilot validation
Increased confidence in AI-assisted triage workflows
Improved interpretability and governance visibility
Stronger collaboration between compliance, product, and data science teams
Reevaluation and deployment of previously stalled ML initiatives
Reduction in response lag for high-risk operational communications

More importantly, the work helped establish institutional muscle memory around governed experimentation.

The systems were not designed to remove human judgment.

They were designed to support it under pressure.

Reflections

Large institutions rarely fail because they lack intelligence.

They fail because decision-making becomes fragmented across systems, incentives, and organizational boundaries.

AI amplifies that problem if governance, accountability, and interpretability are treated as secondary concerns.

The work at Wells Fargo reinforced a principle that continues to shape my approach today:

The most important layer in enterprise AI is often not the model itself.

It is the layer where human beings decide whether the system deserves trust.

That layer must be designed deliberately.

Otherwise, even sophisticated systems collapse under institutional uncertainty.

The future of enterprise AI will not be defined solely by capability.

It will be defined by whether organizations can build systems where people remain capable of understanding, questioning, and governing the decisions being made around them.

Amid the Noise is an ongoing body of work on signal, systems, governance, AI, and the structures that shape human judgment under pressure.