Diagnostic Tool

The Agent Litmus Test

Do You Actually Need an AI Agent?

A 12-point diagnostic from Engineering Reliable AI Agents & Workflows

The Problem This Diagnostic Solves

"AI agent" has become the default pitch for every automation project. Invoice processing? Agent. Customer service? Agent. Data extraction? Agent.

Here's the reality: most of these projects don't need agents.

They need well-designed workflows with targeted AI calls at specific steps. Building an agent when a workflow would suffice means:

  • 5-10x higher costs in development and operation
  • Timelines measured in months instead of weeks
  • 70-75% accuracy vs. 90%+ for workflows
  • Permanent human oversight requirements
  • Compliance challenges from non-deterministic behavior

This diagnostic cuts through the hype. In 10 minutes, you'll know whether your use case genuinely requires an agent—or whether you're about to overengineer.

How the Agent Litmus Test Works

Score your use case across 12 binary criteria in three parts. For each statement that is unequivocally true, add one point.

Your total score places you in one of three zones:

The Workflow Zone

Build a deterministic workflow with targeted AI calls

The Hybrid Zone

Workflow first, then surgical agent components

The Agent Zone

Full agent architecture justified

Each zone has a clear architectural recommendation and guidance on what to build.
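The scoring mechanics above can be sketched in a few lines of Python. Note that the zone cut-offs below (0-4, 5-8, 9-12) are hypothetical placeholders chosen only to illustrate the flow; the real thresholds are part of the full diagnostic.

```python
def litmus_score(answers: list[bool]) -> int:
    """Each criterion that is unequivocally true adds one point."""
    if len(answers) != 12:
        raise ValueError("The test has exactly 12 binary criteria")
    return sum(answers)

def zone(score: int) -> str:
    """Map a total score to a zone. Thresholds here are hypothetical."""
    if score <= 4:
        return "Workflow Zone"
    elif score <= 8:
        return "Hybrid Zone"
    return "Agent Zone"

# Example: only 3 of 12 criteria are unequivocally true.
answers = [True, False, False, True, False, False,
           False, True, False, False, False, False]
print(zone(litmus_score(answers)))  # Workflow Zone
```

The point of the binary scoring is deliberate: "mostly true" scores zero, which keeps wishful answers from inflating the total.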

The Three Assessment Areas

Part 1: Cost & Value

"The ROI Reality" — Is the ROI defensible?
Tests whether your organization can absorb the costs and timelines agents require. Evaluates if business value justifies higher investment—not just in development, but in ongoing oversight and maintenance.

Sample Criterion:

☐ The business value unequivocally justifies a 5-10x increase in cost and complexity.

If you're expecting an agent to cost "about the same" as traditional automation, reassess.
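A back-of-envelope check makes the 5-10x criterion concrete. The figures below are entirely hypothetical; substitute your own cost and value estimates.

```python
# Hypothetical numbers: does the business value still clear the bar
# when the workflow's cost is multiplied by 5x and by 10x?
workflow_cost = 40_000   # hypothetical build + first-year operation
annual_value = 250_000   # hypothetical business value delivered

for multiplier in (5, 10):
    agent_cost = workflow_cost * multiplier
    print(multiplier, annual_value > agent_cost)  # 5 True, 10 False
```

In this hypothetical case the value survives a 5x multiplier but not a 10x one, which is exactly the kind of ambiguity the criterion's word "unequivocally" is meant to disqualify.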

Part 2: Variability & Complexity

"The Boundedness Check" — Is the problem truly unbounded?
Most problems feel infinite but aren't. This section stress-tests whether your use case is genuinely unbounded or just complex. Complex problems with finite input types can be handled by workflows. Only truly unbounded problems justify agent architecture.

Sample Criterion:

☐ The problem space is genuinely infinite; valid inputs cannot be categorized into a finite list of types.

If you can enumerate your input types—even 500 of them—this may not be agent territory.
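The enumerability test can be sketched as deterministic routing. The type names and handlers below are hypothetical; the key idea is that a finite type list, however large, routes with plain branching, and the unclassifiable remainder falls to a finite escape hatch rather than open-ended reasoning.

```python
# Hypothetical finite type list -- this could just as well hold 500 entries.
KNOWN_TYPES = {"invoice", "receipt", "purchase_order"}

def handle(doc_type: str, payload: str) -> str:
    if doc_type in KNOWN_TYPES:
        # Deterministic branch: a fixed pipeline per type, possibly
        # with a targeted AI call (e.g. field extraction) inside one step.
        return f"routed to {doc_type} pipeline"
    # Bounded fallback, not agentic improvisation.
    return "queued for human review"

print(handle("invoice", "..."))   # routed to invoice pipeline
print(handle("contract", "..."))  # queued for human review
```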

Part 3: Reasoning & Autonomy

"The Autonomy Trap" — Is true reasoning essential?
Distinguishes genuine reasoning from sophisticated pattern-matching. Many tasks that seem to require "intelligence" work fine with deterministic logic plus AI components. True agent territory requires novel problem-solving for situations never seen before.

Sample Criterion:

☐ Users can tolerate—and easily recover from—occasional hallucinations or nonsensical failures.

Agents hallucinate. That's inherent, not a bug. Your use case must absorb this.

What Your Score Tells You

Your total score (0-12) places you in one of the three zones above, each with an architectural recommendation and specific guidance on approach. The complete diagnostic, available below, spells out the scoring thresholds and zone definitions behind that placement.

Who Should Use This Diagnostic

Product Managers

Scoping AI features and setting realistic expectations

Engineering Leads

Evaluating architecture proposals before committing resources

CTOs

Reviewing AI investments before budget approval

Consultants

Advising clients on the right AI approach

Anyone Being Pitched

Cut through vendor hype with objective criteria

Team exercise:

Works best as a group activity. Disagreements on scores reveal hidden assumptions that need resolution before architecture decisions.

Frequently Asked Questions

What is the difference between an AI agent and a workflow?
An AI agent reasons, makes decisions, and adapts to novel situations. A workflow is a deterministic sequence with targeted AI calls at specific points—extraction, classification, etc. Workflows offer higher reliability and lower costs. Agents handle unbounded problems but require more investment. Most use cases are better served by workflows with surgical AI integration.
How do I know if I need an AI agent?
Score your use case across 12 criteria covering Cost & Value, Variability & Complexity, and Reasoning & Autonomy. Your total places you in one of three zones with clear recommendations. Takes about 10 minutes.
Do most AI projects actually need agents?
Agents are the right choice for less than 10% of business problems. Most are better served by workflows with AI components at specific steps. The Agent Litmus Test helps determine if your problem falls into that rare agent-justified category.
What if I build an agent when a workflow would work?
Expect 5-10x higher costs, timelines of months instead of weeks, non-deterministic behavior, permanent oversight requirements, and compliance difficulties.
Can I start with a workflow and add agent capabilities later?
Yes—this is the recommended approach. Build the workflow first, validate in production, identify where human judgment is the actual bottleneck, then add agentic components surgically at those points.

Download the Complete Diagnostic

Get the full Agent Litmus Test with all 12 criteria and zone recommendations.

What you get:

  • All 12 scoring criteria across three assessment areas
  • Scoring thresholds for each zone
  • Zone definitions with architectural recommendations
  • Printable worksheet for team sessions
  • Notes template for documenting reasoning

Related Diagnostics

From the Book

This diagnostic is from Engineering Reliable AI Agents & Workflows. The book covers the reasoning behind each criterion, case studies across scoring zones, and what to do after you have your score.

Get the Book →