Agentic Fiction Pipeline: Testing Multi-Agent Narrative Generation

June 1, 2026


Hypothesis

Multiple specialized AI agents (planner, drafter, validator, reviser) working in sequence can produce more coherent narrative than a single monolithic model.

Experiment Design

Setup

Planner agent: Takes a story premise and generates a detailed outline with character arcs, plot beats, and emotional beats
Drafter agent: Receives the outline and generates a full first draft (5,000 words)
Validator agent: Checks the draft for plot holes, character consistency, and tonal shifts
Reviser agent: Rewrites the draft in response to the validator's feedback, then hands it back for re-validation
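The hand-off between the four agents can be sketched roughly as follows. This is a minimal illustration, not the experiment's actual code: the function signatures, the `PipelineResult` container, the `max_revisions` parameter, and the stub agents are all assumptions made up for the example. A real agent would call a model where the stubs return canned strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    """Hypothetical container for what the pipeline produces."""
    outline: str
    draft: str
    issues: List[str] = field(default_factory=list)
    revisions: int = 0


def run_pipeline(
    premise: str,
    planner: Callable[[str], str],
    drafter: Callable[[str], str],
    validator: Callable[[str], List[str]],
    reviser: Callable[[str, List[str]], str],
    max_revisions: int = 1,
) -> PipelineResult:
    """Run planner -> drafter -> validator -> reviser in sequence."""
    outline = planner(premise)
    draft = drafter(outline)
    issues = validator(draft)
    revisions = 0
    # Revise only while the validator still reports issues.
    while issues and revisions < max_revisions:
        draft = reviser(draft, issues)
        issues = validator(draft)
        revisions += 1
    return PipelineResult(outline, draft, issues, revisions)


# Stub agents for illustration only (assumed, not from the experiment).
def stub_planner(premise: str) -> str:
    return f"OUTLINE: {premise}"


def stub_drafter(outline: str) -> str:
    return f"DRAFT based on [{outline}] TODO"


def stub_validator(draft: str) -> List[str]:
    return ["unresolved TODO"] if "TODO" in draft else []


def stub_reviser(draft: str, issues: List[str]) -> str:
    return draft.replace(" TODO", "")


result = run_pipeline(
    "a lighthouse keeper on Europa",
    stub_planner, stub_drafter, stub_validator, stub_reviser,
)
print(result.revisions, result.issues)
```

Passing the agents in as plain callables keeps the orchestration separate from whichever model or prompt sits behind each role, so a single agent can be swapped out without touching the loop.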

Test Set

5 science fiction stories
5 contemporary fiction stories
5 fantasy stories
15 stories total, ~5k words each

Metrics

Plot coherence: Does the story make internal sense? (scored 1-10)
Character consistency: Do characters act according to their established motivations? (scored 1-10)
Tonal stability: Does the voice remain consistent throughout? (scored 1-10)
Pacing: Does the story maintain reader engagement? (scored 1-10)
Validation pass rate: What share of stories score >7/10 on every metric?
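One way to compute the pass rate from per-story scores is sketched below. The metric keys, the `passes` and `pass_rate` helpers, and the sample scores are hypothetical placeholders, not data from this experiment. The comparison is strict, so a story scoring exactly 7 on any metric does not pass.

```python
from typing import Dict

# Assumed key names for the four scored metrics.
METRICS = ("plot_coherence", "character_consistency", "tonal_stability", "pacing")
PASS_THRESHOLD = 7.0


def passes(scores: Dict[str, float], threshold: float = PASS_THRESHOLD) -> bool:
    """A story passes only if every metric is strictly above the threshold."""
    return all(scores[m] > threshold for m in METRICS)


def pass_rate(stories: Dict[str, Dict[str, float]]) -> float:
    """Fraction of stories that pass on all metrics (0.0 for an empty set)."""
    if not stories:
        return 0.0
    return sum(passes(s) for s in stories.values()) / len(stories)


# Placeholder scores for illustration only; not results from the experiment.
example = {
    "story_1": {"plot_coherence": 8, "character_consistency": 8,
                "tonal_stability": 8, "pacing": 8},
    "story_2": {"plot_coherence": 9, "character_consistency": 7,
                "tonal_stability": 8, "pacing": 8},
}
print(f"{pass_rate(example):.0%}")
```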

Preliminary Results

As of June 1, 2026:

Completed: 5 science fiction stories
Pass rate: 80% (4/5 stories scored >7/10 on all metrics)
Standouts:
- Story A (SF): 9/10 plot coherence, 8/10 character consistency
- Story B (SF): 8.5/10 overall, strong pacing
Weak areas:
- Story C: Character motivation breakdown in act 2 (5/10 character consistency)
- Story E: Tonal shift midway through (6/10 tonal stability)

Findings So Far

The planning phase matters. When the planner agent provides clear emotional beats, the drafter produces more coherent stories.
Validation is crucial. The validator agent catches ~70% of consistency issues before they reach the revision stage.
Revision iteration helps but isn’t magic. One revision cycle improves scores by ~1-2 points. After that, diminishing returns.
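Genre may matter. Science fiction is the only genre completed so far (4/5 stories passed), so there is no cross-genre comparison yet. The working guess is that the scaffolding suits plot-heavy genres better than character-driven fiction; the contemporary fiction and fantasy runs will test that.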
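The diminishing-returns observation suggests stopping revision once a cycle no longer buys a meaningful gain. A minimal sketch of such a stopping rule follows. The `min_gain` and `max_cycles` parameters, the function name, and the toy scores are all assumptions for illustration; the experiment does not specify a stopping rule.

```python
from typing import Callable, List, Tuple, TypeVar

D = TypeVar("D")  # whatever type represents a draft


def revise_until_plateau(
    draft: D,
    score: Callable[[D], float],
    revise: Callable[[D], D],
    min_gain: float = 0.5,
    max_cycles: int = 3,
) -> Tuple[D, List[float]]:
    """Keep revising while each cycle improves the score by at least min_gain."""
    history = [score(draft)]
    for _ in range(max_cycles):
        candidate = revise(draft)
        new_score = score(candidate)
        # Stop (and discard the candidate) once the gain falls below the bar.
        if new_score - history[-1] < min_gain:
            break
        draft = candidate
        history.append(new_score)
    return draft, history


# Toy stand-ins: a "draft" is just a version number, and the scores are
# made-up values shaped like "big first gain, small gains afterwards".
toy_scores = [6.0, 7.5, 7.7, 7.8]
final_version, score_history = revise_until_plateau(
    0,
    score=lambda v: toy_scores[v],
    revise=lambda v: v + 1,
)
print(final_version, score_history)
```

With these toy numbers the first revision is kept and the second is rejected, which mirrors the pattern described above without claiming to reproduce it.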

Next Steps

Complete remaining 10 stories
Test with human readers (does validation score correlate with reader experience?)
Experiment with agent feedback loops (does giving each agent access to the previous agents' output improve results?)
Test on longer narratives (10k words, 20k words)
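For the human-reader test, one plausible way to check whether validator scores track reader experience is a rank correlation, which depends only on how the stories are ordered and not on the two scales matching. The sketch below implements Spearman's rank correlation in plain Python. The choice of Spearman, the helper names, and the sample numbers are assumptions for illustration; no reader data exists yet.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder numbers only: validator scores (1-10) and reader ratings (1-5).
validator_scores = [6.0, 7.5, 8.0, 9.0]
reader_ratings = [3.0, 3.5, 4.0, 4.5]
rho = spearman(validator_scores, reader_ratings)
print(round(rho, 3))
```

With only 15 stories in the test set, any such coefficient would be a rough signal, not a firm conclusion.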

Code & Reproducibility

[GitHub repo link when available]

Experiment Status: In Progress
Last Updated: June 1, 2026
Next Review: June 15, 2026