📚 forge: self-portrait of a thinkpiece pipeline

Voice model: paul-v7d-qwen35-27b (LoRA r=64 on Qwen3.5-27B). Orchestrator: Claude Opus 4.7 (1M context). Hardware: DGX Spark GB10, 128GB unified LPDDR5X, sm_121 CUDA. Last updated: 2026-05-01.

This page is drawn by the agent that built it. Sister project scribe ships fiction; forge ships thinkpieces. Both consume the same voice checkpoint via the _coordination/ deaddrop. Architecture inspired by scribe's portrait, adapted to longform-essay shape.

What this is

Forge is the side of the dual pipeline that produces ~5,000-word thinkpieces in the voice of Paul Logan. Pieces ship to LinkedIn, Reddit, and Substack. The voice model (V7D) is the same checkpoint scribe consumes for novel chapters: a single source-of-truth LoRA pinned via config/voice_pin.yaml and verified by scribe-canonical adapter_sha on every restart.

End-to-end pipeline

```mermaid
flowchart TD
    Q[runs/queue/blog_ideas.jsonl<br/>operator-curated] --> Pull[queue.py pull<br/>highest-priority queued idea]
    Pull --> Driver[scripts/thinkpiece/run_thinkpiece.py]
    subgraph V7D_GEN[V7D generation]
        Driver --> Outline[Stage 1: outline<br/>~80s, 5 sections]
        Outline --> Sections[Stage 2: 5 sections<br/>w/ critic+retry loop]
    end
    Sections --> Stitch[stitch + clean_output<br/>strip em-dashes, fix mojibake]
    Stitch --> Md[thinkpiece.md<br/>~5000w, 100% qpass target]
    Md --> Images[gen_images.py<br/>Stability AI metaphor prompts<br/>4 candidates per slot]
    Images --> Picks[image_picks.md<br/>operator-in-loop review]
    Md --> LinkCheck[link_check.py<br/>HEAD-check all URLs]
    LinkCheck --> PostAll[post_all.py]
    subgraph POST[post pipeline]
        PostAll --> LI[post_linkedin.py<br/>/v2/ugcPosts API]
        PostAll --> Reddit[post_reddit.py<br/>PRAW]
        PostAll --> SubNotes[post_substack_notes.py<br/>reverse-eng API]
        PostAll --> SubBrowser[post_substack_browser.py<br/>Playwright grey-hat]
    end
    Md --> Metrics[aggregate_metrics.py<br/>+ count_voice_artifacts.py]
    Metrics --> Eval[EVAL_REPORT.md<br/>VALIDATION.md]
    classDef forge fill:#fff3e8,stroke:#d96b1c
    classDef coord fill:#f3e8ff,stroke:#7a3fa6
    classDef post fill:#e8f0ff,stroke:#2563b8
    classDef op fill:#fffbe0,stroke:#c4a000
    class V7D_GEN,Driver,Outline,Sections,Stitch,Md forge
    class POST,LI,Reddit,SubNotes,SubBrowser post
    class Q,Picks op
```

Per-section state machine

Each section runs through a critic+retry loop modeled on scribe's scene-kick recovery and donts-critic patterns. Up to 3 attempts per section; the best attempt, judged by lowest tic count, ships.

```mermaid
stateDiagram-v2
    [*] --> BUILD_PROMPT
    BUILD_PROMPT --> V7D_POST: section title + outline + last-paragraph recap
    V7D_POST --> CLEAN_OUTPUT: ~120s wall
    CLEAN_OUTPUT --> STRIP_LEADING_H2: postprocess.clean_output (em-dashes, mojibake)
    STRIP_LEADING_H2 --> COUNT_TICS: handles V7D body-title echo
    COUNT_TICS --> RETRY: tics > 0 AND attempts < max
    COUNT_TICS --> QUALITY_GATE: tics == 0 OR attempts == max
    RETRY --> BUILD_PROMPT: inject ban-list of literal phrases V7D used
    QUALITY_GATE --> SHIP: words >= 300 AND no repeat
    QUALITY_GATE --> RETRY_OUTER: qpass failed
    RETRY_OUTER --> BUILD_PROMPT: outer retry (default 1x)
    SHIP --> [*]
```

What each piece does

| file | role |
| --- | --- |
| scripts/thinkpiece/run_thinkpiece.py | AgentWrite 2-stage driver. Outline + 5 sections + critic+retry loop + paragraph-echo detection. |
| scripts/thinkpiece/gen_images.py | Stability AI hero+section image gen with per-piece metaphor map (HERO_METAPHORS + SECTION_METAPHOR_HINTS). Falls back to generic editorial prompt for unmapped pieces. |
| scripts/thinkpiece/count_voice_artifacts.py | Mechanical voice scanner. Counts em-dash variants, clichés, LLM-tells (12 banned patterns). |
| scripts/thinkpiece/aggregate_metrics.py | Per-piece + batch metrics from run.jsonl + voice_fidelity.json. Writes METRICS.json. |
| scripts/thinkpiece/queue.py | Blog ideas queue manager. list/show/pull/done/block/add ops on runs/queue/blog_ideas.jsonl. |
| scripts/post/post_linkedin.py | LinkedIn UGC Post API (OAuth, w_member_social). Adapts piece to 1300-char hook + link. |
| scripts/post/post_reddit.py | PRAW script-app. Submits 40k-char self-post to target subreddit. |
| scripts/post/post_substack_notes.py | Substack Notes (short ~480-char teaser + canonical link) via reverse-engineered API. |
| scripts/post/post_substack_browser.py | Playwright browser automation for full Substack post (no official API as of 2026-04). Operator-authorized grey-hat. |
| scripts/post/post_all.py | Cross-platform orchestrator. Dry-run by default; --confirm fires. |
| scripts/rag/link_check.py | Pre-post URL health gate. HEAD-checks all markdown links + plain URLs. Caches 24h. Fails on 4xx/5xx. |
| src/forge/postprocess/__init__.py | Egress contract v1.0.1: strip `<think>`, fix mojibake, replace em-dashes with commas. Same module scribe consumes. |
| config/voice_pin.yaml | Voice model contract. Production pin = V7D (manifest_digest ae32ff9c…). Pre-commit hook auto-ships pin bumps to scribe via deaddrop. |
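For orientation, the egress contract's three steps (strip `<think>`, fix mojibake, em-dashes to commas) could look roughly like this. The mojibake table is an illustrative subset, not the module's actual map:

```python
import re

# Illustrative mojibake pairs (UTF-8 bytes mis-decoded as cp1252);
# the real postprocess module's table is the source of truth.
MOJIBAKE = {"â€™": "'", "â€œ": '"'}

def clean_output(text: str) -> str:
    # 1. strip any leaked <think>...</think> reasoning blocks
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # 2. repair known mojibake sequences
    for bad, good in MOJIBAKE.items():
        text = text.replace(bad, good)
    # 3. replace em-dashes (and their surrounding spaces) with a comma
    return re.sub(r"\s*\u2014\s*", ", ", text)
```

Running this at egress, after stitching and before any poster script, is what lets the em-dash count in the scoreboard below sit at zero.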

The LLM-ism filter (the new thing)

V7D produces strong Paul-voice prose ~90% of the time. The remaining 10% is concentrated in a handful of verbal tic families inherited from training data. The pipeline now detects these mechanically and re-rolls.

| banned pattern | why |
| --- | --- |
| I want to be clear / I want to be honest | Hedge announcing softening. Paul-voice rule: state things directly. |
| the truth is / the reality is / in many ways | Filler. Removable scaffolding. |
| That's the thing / pattern / point / core / whole | V7D section-ender re-anchor tic. |
| Here's what / Here is what / Here's the thing | Setup phrase that V7D leans on instead of just describing the thing. |
| The question is whether / how / why | Rhetorical-question signpost. Once per piece OK; 4× in one section is a tic. |
| more importantly / interestingly / crucially / importantly / essentially / basically / fundamentally | Filler adverbs. |

On detection, the retry prompt injects the literal phrase V7D wrote with an explicit instruction to rewrite without it. Up to 2 retries; the best attempt (lowest tic count) ships. Validated against the batch v1 worst-tic case (roman_citizenship: 14 tells → v3 roman_roads section 1: 1 tic on attempt 1).
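A mechanical scanner over the ban table might look like this; the regexes are a sketch, not count_voice_artifacts.py's actual source, and they count every hit (including the once-per-piece-tolerated rhetorical-question signpost):

```python
import re

# One regex per pattern family in the ban table.
TIC_PATTERNS = [
    r"\bI want to be (?:clear|honest)\b",
    r"\b(?:the truth is|the reality is|in many ways)\b",
    r"\bThat's the (?:thing|pattern|point|core|whole)\b",
    r"\bHere(?:'s| is) (?:what|the thing)\b",
    r"\bThe question is (?:whether|how|why)\b",
    r"\b(?:more importantly|interestingly|crucially|importantly"
    r"|essentially|basically|fundamentally)\b",
]

def count_tics(text: str) -> tuple[int, list[str]]:
    """Return total tic count plus the literal phrases found (for the ban-list)."""
    hits: list[str] = []
    for pat in TIC_PATTERNS:
        hits += re.findall(pat, text, flags=re.IGNORECASE)
    return len(hits), hits
```

Returning the literal matched strings, not just a count, is what lets the retry prompt ban the exact phrases V7D wrote.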

Voice fidelity scoreboard (batch v1, 2026-05-01)

| metric | value |
| --- | --- |
| pieces shipped | 5 |
| total words | 25,342 |
| V7D wall (sequential, max-num-seqs 2) | 57.4 min (~11.5 min/piece) |
| qpass overall (forge.postprocess.quality_gate) | 100% (25/25 sections) |
| voice fidelity mean (Claude judge, 6 axes) | 4.72 / 5 |
| em-dash count | 0 |
| cliché count (initial scan) | 2.2 mean / piece |
| LLM tells (extended scan) | 9.2 mean / piece |
| image API spend | $4.92 of $5 cap (188 calls) |

V7D's strongest register: AI/LLM contrarian (5.00/5 on agentic_ai_marketing). Weakest: history+meta (4.40/5 on roman_citizenship, which closed on a slight hedge). The spread is 0.60, so V7D holds across 5 distinct registers with no register collapse.

Coordination with scribe

Hard rules (lifted from CLAUDE.md, broken twice in April 2026)

What's queued

| id | status | stub | register |
| --- | --- | --- | --- |
| idea-001 | queued | AI is making the noise-to-signal ratio untenable: a research roundup | AI/cultural-contrarian |

Next research piece. Sources: curl bug-bounty closure, writing-competition shutdowns, openclaw fiasco. Will exercise link_check.py plus the footnote renderer (Phase 2 of docs/rag_citation_design.md).
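The link gate that piece will exercise can be approximated in a few lines. This sketch omits the real gate's 24h cache, and the 599 sentinel for connection failures is an assumption:

```python
import re
import urllib.error
import urllib.request

# Grabs bare and markdown-embedded URLs; stops at whitespace and closing delimiters.
URL_RE = re.compile(r"https?://[^\s)\]>\"]+")

def check_links(markdown: str, timeout: float = 10.0) -> dict[str, int]:
    """HEAD-check every URL in the text; return {url: status} for failures (>= 400)."""
    failures: dict[str, int] = {}
    for url in sorted(set(URL_RE.findall(markdown))):
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                status = resp.status
        except urllib.error.HTTPError as err:
            status = err.code          # server answered with 4xx/5xx
        except OSError:
            status = 599               # connection failure sentinel (assumption)
        if status >= 400:
            failures[url] = status
    return failures
```

An empty dict means the piece is clear to post; anything else blocks post_all.py before a dead citation ships.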

What's still raw

| area | state |
| --- | --- |
| RAG citation pipeline | Phase 1 only: link-check works; Phase 2 (footnote renderer + source DB) deferred to v2 batch. |
| FA2 install in venv_cu130 | Deferred: would 5× train + serve speed; risky compile. |
| eugr/spark-vllm-docker switch | Deferred: bigger win than per-flag tuning. |
| V7E SFT data prep | P2: filter "here's what" / "that's the thing" patterns from training set; only path to drive persistent tics to 0. |
| LinkedIn comment-link workaround | Designed, not built: secondary /v2/socialActions/{urn}/comments POST. |
| GitHub Pages deploy of this page | Manual: operator action. |

Source

drawn by claude. validated by paul. lives in the same git repo as the pipeline that wrote it.