Agentic Fiction Pipeline

Hypothesis

Multiple specialized AI agents (planner, drafter, validator, reviser) working in sequence can produce more coherent narrative than a single monolithic model.

Experiment Design

Setup

  • Planner agent: Takes a story premise and generates a detailed outline with character arcs, plot beats, and emotional beats
  • Drafter agent: Receives the outline and generates a full first draft (5,000 words)
  • Validator agent: Checks the draft for plot holes, character consistency, and tonal shifts
  • Revision agent: Iterates based on validation feedback

Test Set

  • 5 science fiction stories
  • 5 contemporary fiction stories
  • 5 fantasy stories
  • 15 stories total, ~5k words each

Metrics

  • Plot coherence: Does the story make internal sense? (scored 1-10)
  • Character consistency: Do characters act according to their established motivations? (scored 1-10)
  • Tonal stability: Does the voice remain consistent throughout? (scored 1-10)
  • Pacing: Does the story maintain reader engagement? (scored 1-10)
  • Validation pass rate: How many stories pass >7/10 on all metrics?

Preliminary Results

As of June 1, 2026:

  • Completed: 5 science fiction stories
  • Pass rate: 78% (4/5 stories scored >7/10 on all metrics)
  • Standouts:
    • Story A (SF): 9/10 plot coherence, 8/10 character consistency
    • Story B (SF): 8.5/10 overall, strong pacing
  • Weak areas:
    • Story C: Character motivation breakdown in act 2 (5/10)
    • Story E: Tonal shift midway through (6/10 tonal stability)

Findings So Far

  1. The planning phase matters. When the planner agent provides clear emotional beats, the drafter produces more coherent stories.

  2. Validation is crucial. The validator agent catches ~70% of consistency issues before they reach the revision stage.

  3. Revision iteration helps but isn’t magic. One revision cycle improves scores by ~1-2 points. After that, diminishing returns.

  4. Genre matters. Science fiction scored highest (78% pass), suggesting the scaffolding works better for plot-heavy genres than character-driven literary fiction.

Next Steps

  • Complete remaining 10 stories
  • Test with human readers (does validation score correlate with reader experience?)
  • Experiment with agent feedback loops (does each agent knowing previous agent’s output improve results?)
  • Test on longer narratives (10k words, 20k words)

Code & Reproducibility

[GitHub repo link when available]


Experiment Status: In Progress Last Updated: June 1, 2026 Next Review: June 15, 2026