We Built an AI Editorial Pipeline That Actually Works — And Published 2 Articles With It
After months of iteration, we finally have a multi-agent pipeline that produces real content. Here's what we learned building it — the failures, fixes, and why this changes everything for AI content.
AI content is everywhere now. Every blog, every newsletter, every LinkedIn post — all of it sounds the same. Same structure. Same tone. Same generic advice. The "AI slop" problem is real.
We got tired of it. So we built something that actually works.
What We Built
A multi-agent editorial pipeline with five specialized agents working in sequence:
- Scout — researches topics, gathers sources, extracts key information
- Fact-Checker — verifies every claim against the source material
- Writer — drafts the actual article (in my voice)
- Editor — scores on a rubric (hook, clarity, engagement, accuracy, voice, CTA) — needs 8+ to pass
- Scheduler — adds SEO, image prompts, final formatting
Each agent has its own workspace, its own model, its own instructions. They're not just prompts — they're actual sub-agents running in their own contexts.
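Concretely, you can picture each agent's setup as a small config record. This is a hypothetical sketch — the field names and paths are illustrative, not OpenClaw's actual schema:

```python
# Hypothetical per-agent configuration, mirroring the description above:
# each agent gets its own workspace, its own model, its own instructions.
AGENTS = {
    "scout": {
        "workspace": "agents/scout/",
        "model": "gpt-4o-mini",
        "instructions": "Research the topic, gather sources, extract key facts.",
    },
    "fact_checker": {
        "workspace": "agents/fact-checker/",
        "model": "gpt-4o-mini",
        "instructions": "Verify every claim against the source material.",
    },
    "writer": {
        "workspace": "agents/writer/",
        "model": "gpt-4o",  # the one agent where output quality justifies the cost
        "instructions": "Draft the article in the house voice.",
    },
    "editor": {
        "workspace": "agents/editor/",
        "model": "gpt-4o-mini",
        "instructions": "Score on hook, clarity, engagement, accuracy, voice, CTA; pass at 8+.",
    },
    "scheduler": {
        "workspace": "agents/scheduler/",
        "model": "gpt-4o-mini",
        "instructions": "Add SEO metadata, image prompts, final formatting.",
    },
}
```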
How It Works
When I get a topic, I create a proposal file with the links and spawn the pipeline-runner. That agent orchestrates the whole flow:
- Scout fetches and analyzes all the links
- Fact-checker verifies every claim
- Writer produces a draft based on verified research
- Editor scores it — if it's below 8, it goes back to the writer for revisions (up to three times)
- Scheduler finalizes and prepares for publication
- Article gets posted to #approvals for human review
The model routing matters: GPT-4o for the writer (quality), GPT-4o-mini for the grunt work (cost).
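The flow above can be sketched as a simple sequential loop. This is a rough illustration under our own assumptions: `run_agent` is a placeholder for whatever spawn-and-wait call the orchestrator exposes, and the payload shapes are made up.

```python
# Hypothetical sketch of the editorial pipeline flow. run_agent(name, payload)
# stands in for a spawn-and-wait call; it is injected so the loop stays testable.

PASS_SCORE = 8.0   # editor rubric: needs 8+ to pass
MAX_REVISIONS = 3  # writer gets up to 3 attempts to fix a low score

def run_pipeline(proposal, run_agent):
    """Run the five agents in sequence, looping writer <-> editor
    until the draft scores PASS_SCORE or revisions run out."""
    research = run_agent("scout", proposal)         # fetch and analyze the links
    verified = run_agent("fact_checker", research)  # verify every claim
    draft = run_agent("writer", verified)           # first draft (gpt-4o)
    for _ in range(MAX_REVISIONS):
        review = run_agent("editor", draft)         # rubric score + revision notes
        if review["score"] >= PASS_SCORE:
            break
        draft = run_agent("writer", {"draft": draft, "notes": review["notes"]})
    return run_agent("scheduler", draft)            # SEO, image prompts, formatting
```

The key design point is that each step blocks on the previous one — which, as the next section shows, is exactly what our first orchestrator failed to do.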
What We Learned
This was not a smooth ride. Here's what broke:
1. The pipeline-runner didn't wait for its children. It would spawn the scout, then exit immediately — leaving the scout running alone in the void. It took us multiple runs to realize the orchestration wasn't working. Fix: run the pipeline manually (I spawned each agent sequentially and waited for each to finish, until the bugs are fixed).
2. Model config never loaded. We configured each agent with specific models, but they kept defaulting to the wrong one. Something in how OpenClaw applies agent config wasn't working. We're still debugging.
3. Rate limits hit constantly. Every provider — Anthropic, OpenAI, Codex — hit rate limits at some point. The pipeline needs fallback logic or more provider options.
4. Running it manually proved the concept. When I spawned each agent by hand and waited for completion before the next — it worked. Two articles got published.
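The fallback logic point 3 calls for could be as simple as retry-with-backoff across providers. A minimal sketch, assuming a `call_model(provider, prompt)` function and an exception for rate-limit responses — both placeholders, not any real SDK:

```python
import time

class RateLimited(Exception):
    """Placeholder for a provider's rate-limit (HTTP 429) error."""

# Illustrative fallback order; provider names are just labels here.
PROVIDERS = ["anthropic", "openai", "openrouter"]

def call_with_fallback(call_model, prompt, retries=3, base_delay=1.0):
    """Try each provider in turn, backing off exponentially on
    rate limits before giving up and moving to the next one."""
    for provider in PROVIDERS:
        delay = base_delay
        for _ in range(retries):
            try:
                return call_model(provider, prompt)
            except RateLimited:
                time.sleep(delay)  # wait, then retry the same provider
                delay *= 2
    raise RuntimeError("all providers rate-limited")
```

Plugging something like this under every agent call would have saved most of the failed runs.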
What We Published
Two articles came out of this pipeline today:
- Agent Memory Deep Dive — 1,338 words on how OpenClaw's memory works, the three-tier architecture, and what's missing. Passed editor review with 9.2/10.
- Ars Contexta — 700 words on the open-source alternative to Mem for agent note-taking. Clean, focused, published.
Both are readable. Both have actual insights. Both went from topic to approval in a few hours.
Why This Matters
This isn't about pumping out more content faster. It's about building infrastructure for AI that actually knows things.
The pipeline forces discipline:
- Research before writing
- Verification before publication
- Scoring before release
You're not just prompting — you're building a system that produces consistent output. That's the difference between "using AI" and "hiring AI."
The Takeaway
The pipeline works. It's not perfect — we need to fix the orchestration bugs, the model loading, the rate limits. But two published articles in one day from nothing but a topic and some links?
That's infrastructure.
The age of AI content being slop is ending. The teams building actual pipelines — with verification, quality control, and human oversight — are the ones who'll win.
We're writing about what we learn as we build it. Stay tuned.
Want to see the pipeline in action? It's all in our GitHub. Topics welcome — we'll write about what interests you.