`/transcribe`


Placeholders like ACME Agency, <id> and you@example.com mark values that are per-agency — your install fills them with YOUR clients and accounts. If a section references a helper script you don't have yet, it ships with that workflow's install.

Transcribe — Long-Form Video/Audio → Knowledge Base

Triggers

/transcribe <file> --topic <topic>
"transcribe this video / audio / workshop / interview / audiobook"
"ingest this Hormozi video / Brunson talk / sales training into the knowledge base"
"add this to the hormozi knowledge base"

What it produces

For input ~/Downloads/sales-workshop.mp4 with --topic hormozi --slug sales-workshop:

.claude/context/hormozi/
├── README.md                     # Index — script appends a row here
├── transcripts/
│   └── sales-workshop.md         # Verbatim, segment-timestamped (Whisper)
├── synthesis/
│   └── sales-workshop.md         # Structured synthesis (Sonnet, mimics existing book format)
└── slides/sales-workshop/        # Only if --with-slides
    └── slide_0001.png ...

The transcript file is archival: large and never auto-loaded into agents. The synthesis file (called the "book file" in the rest of this document) is what agents Read on demand for retrieval. Keep this distinction in mind when deciding which flags to pass.

Inputs accepted

Local file path: .mp4, .mkv, .mov, .m4a, .mp3, .wav, anything ffmpeg reads
--topic <topic> (required) — folder under .claude/context/. Existing topics: hormozi, copywriting-refs, media-buying-refs. Create a new one if needed.
--slug <slug> (optional) — output basename. Defaults to sanitized input filename.
--with-slides — run scene detection + Claude vision captions. Default OFF. Turn ON for slide-deck-heavy content (workshops with frameworks/diagrams). Turn OFF for talking-head, interview, podcast, audiobook.
--no-synthesis — skip the Claude book pass. Useful if you only want the raw transcript (e.g. for verbatim quote retrieval).
--style-ref <path> — point at a specific existing book file for synthesis to mimic. Auto-detects an existing book in the topic if absent.
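As an illustration of the "sanitized input filename" default for --slug, here is a minimal sketch. The helper name `defaultSlug` and the exact rules (lowercase, hyphens for anything non-alphanumeric) are assumptions for illustration; the sanitizer inside shared/transcribe.mjs may behave differently.

```javascript
// Hypothetical helper: the real sanitizer in shared/transcribe.mjs may differ.
function defaultSlug(inputPath) {
  // Take the last path segment, handling both / and \ separators.
  const base = inputPath.split(/[\\/]/).pop();
  // Drop the final extension (".mp4", ".m4a", ...), if any.
  const stem = base.replace(/\.[^.]+$/, "");
  return stem
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
}

console.log(defaultSlug("~/Downloads/sales-workshop.mp4"));         // sales-workshop
console.log(defaultSlug("C:\\Videos\\Sales Workshop (Day 1).MOV")); // sales-workshop-day-1
```

Passing --slug explicitly sidesteps any guesswork about how a filename with spaces or punctuation will be normalized.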

Cost (rough)

Component	Cost (4hr video)
Whisper API	~$1.45
Vision (slides, ~50 imgs)	~$0.05
Synthesis (Sonnet 4.6)	~$0.50
Total	~$2

Wall time: ~5-10 min for a 4hr video (Whisper runs 4-way parallel).
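The table can be reproduced with a rough estimator, which helps when sizing a shorter or longer recording. This is a sketch only: the per-minute and per-image rates are assumptions back-derived from the 4hr column above, the flat synthesis cost is a simplification, and `estimateCostUsd` is a hypothetical name. Check current provider pricing before relying on the numbers.

```javascript
// Assumed rates, back-derived from the table above; not authoritative pricing.
const WHISPER_USD_PER_MIN = 0.006;  // assumption: 240 min * 0.006 is about $1.44
const VISION_USD_PER_IMAGE = 0.001; // assumption: $0.05 / 50 images
const SYNTHESIS_USD_PER_RUN = 0.5;  // assumption: treated as flat per run

function estimateCostUsd({ hours, slideCount = 0, synthesis = true }) {
  const whisper = hours * 60 * WHISPER_USD_PER_MIN;
  const vision = slideCount * VISION_USD_PER_IMAGE;
  const synth = synthesis ? SYNTHESIS_USD_PER_RUN : 0;
  return { whisper, vision, synthesis: synth, total: whisper + vision + synth };
}

const estimate = estimateCostUsd({ hours: 4, slideCount: 50 });
console.log(estimate.total.toFixed(2)); // 1.99
```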

Workflow

Step 1 — Confirm intent with the user

Before running, confirm:

Input file path (must exist locally)
Topic folder — where it goes. Default to existing topics rather than creating new ones unless the content really doesn't fit.
Slides or not — ask if the source has slides/diagrams worth capturing. Defaults: workshop/training/presentation → yes, interview/podcast/audiobook → no.
Synthesis or not — almost always yes. Skip only when the user wants raw text only.

Step 2 — Run

node shared/transcribe.mjs "<input_file>" --topic <topic> --slug <slug> [--with-slides]

Run from the repo root (c:\Users\faris\agency-os). The script:

Extracts audio with ffmpeg (16kHz mono mp3)
Splits into ~10-min chunks
Sends chunks to Whisper API in parallel (4 simultaneous)
Stitches segments back with timestamps
(if --with-slides) detects slide changes via ffmpeg scene detection, screenshots each, captions each via Claude Haiku
Writes transcript markdown with inline screenshots
(if synthesis enabled) Calls Claude Sonnet 4.6 with the topic's existing synthesis file as a style reference, produces a structured synthesis, writes to synthesis/<slug>.md
Appends a row to <topic>/README.md
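The chunk, parallel-send, and stitch steps above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: the function names, the segment shape (`start`, `end`, `text` in seconds), and the `fakeTranscribe` stand-in for the real API call are all hypothetical. The key idea is that each chunk's segment timestamps are chunk-relative, so stitching means adding the chunk's start offset.

```javascript
const CHUNK_SECONDS = 600; // ~10-minute chunks, per the list above
const PARALLEL = 4;        // simultaneous requests, per the list above

// Split a total duration into [start, end) windows of at most chunkSeconds.
function planChunks(totalSeconds, chunkSeconds = CHUNK_SECONDS) {
  const chunks = [];
  for (let start = 0; start < totalSeconds; start += chunkSeconds) {
    const end = Math.min(start + chunkSeconds, totalSeconds);
    chunks.push({ index: chunks.length, start, end });
  }
  return chunks;
}

// Run `worker` over `items` with at most `limit` calls in flight.
// Results are stored by index, so output order matches input order.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}

// Shift each chunk-relative segment onto the full recording's timeline.
function stitchSegments(chunks, perChunkSegments) {
  return chunks.flatMap((chunk, i) =>
    perChunkSegments[i].map((seg) => ({
      start: seg.start + chunk.start,
      end: seg.end + chunk.start,
      text: seg.text,
    }))
  );
}

// Stand-in for the real transcription call: one fake segment per chunk.
async function fakeTranscribe(chunk) {
  return [{ start: 0, end: chunk.end - chunk.start, text: `chunk ${chunk.index}` }];
}

async function demo() {
  const chunks = planChunks(4 * 3600); // 24 chunks for a 4hr recording
  const perChunk = await mapWithConcurrency(chunks, PARALLEL, fakeTranscribe);
  return stitchSegments(chunks, perChunk);
}

demo().then((segments) => console.log(segments.length, segments[1].start)); // 24 600
```

Lowering the `limit` argument is the same lever as lowering PARALLEL in the script: fewer requests in flight at once, at the price of a longer wall time.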

Step 3 — Hand back to the user

After the script finishes, output:

Full path to the transcript file
Full path to the book file (if synthesis ran)
A one-line "what's next", usually: open the book file, then refine the front-matter "core thesis" line in the README index, which the script leaves as _fill in core thesis_.

Step 4 — Refine the synthesis

The script's Sonnet pass is a strong V1 but won't match a hand-edited book like 100m-money-models.md in depth. After the run, suggest:

"Open synthesis/<slug>.md — the synthesis is solid but a hand pass to add cross-references to the Cross-Book Playbook (in the topic README) and to flesh out the Application Map will make this much more useful long-term."

Don't do this pass automatically — it's slow and judgment-heavy. Let Faris choose to do it.

Topic-specific notes

`hormozi`

The existing synthesis/100m-money-models.md is the gold standard the synthesis tries to mimic.
After ingesting, the user typically wants to:

Update the Cross-Book Playbook in the topic README with new principles
Update the Application Map with new tactics for Faris's businesses

These are hand-edits, not script work. Suggest them but don't do them automatically.

Common pitfalls

Wrong path quoting on Windows — wrap input file in double quotes when it contains spaces.
Very long videos (>6hr) — synthesis truncates to 350K chars (~5hr of speech). For 8hr+ content, split into two sessions and run twice with different slugs.
Slide detection too aggressive/loose — threshold is 0.35. If you get hundreds of screenshots from a talking-head clip, re-run without --with-slides. If you miss obvious slide changes in a deck-heavy video, edit SCENE_THRESHOLD in shared/transcribe.mjs to 0.25 and re-run.
Whisper rate limits — script runs 4 chunks in parallel. If you hit 429s, lower PARALLEL in the script.
Existing slug — script writes to transcripts/<slug>.md and synthesis/<slug>.md and overwrites without asking. Use a different --slug to keep both.
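To make the slide-detection threshold concrete, the sketch below builds an ffmpeg argument list using ffmpeg's `select` filter with its `scene` score. The function name `sceneDetectArgs` and the exact arguments are assumptions for illustration; the script's real invocation may differ. Only the threshold values (0.35 default, 0.25 for deck-heavy video) and the slide_0001.png naming come from this document.

```javascript
const SCENE_THRESHOLD = 0.35; // default named above; 0.25 catches subtler changes

// Build ffmpeg arguments that keep only frames whose scene-change score
// exceeds the threshold, written out as numbered PNGs.
function sceneDetectArgs(inputPath, outDir, threshold = SCENE_THRESHOLD) {
  return [
    "-i", inputPath,
    // Quotes inside the filter keep the comma from being read as a filter separator.
    "-vf", `select='gt(scene,${threshold})'`,
    // Emit one image per selected frame; newer ffmpeg builds prefer "-fps_mode vfr".
    "-vsync", "vfr",
    `${outDir}/slide_%04d.png`,
  ];
}

console.log(sceneDetectArgs("talk.mp4", "slides/talk").join(" "));
// -i talk.mp4 -vf select='gt(scene,0.35)' -vsync vfr slides/talk/slide_%04d.png
```

A lower threshold selects more frames, so dropping from 0.35 to 0.25 trades missed slide changes for more screenshots to caption.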

Why this exists

Long-form video is the highest-density learning material Faris consumes (Hormozi workshops, Brunson talks, sales training, prospect deep-dives), but it's the hardest to retrieve from. This skill turns that material into structured, agent-readable knowledge that the copywriter, media-buyer, and sales-ops agents can pull from when designing offers, ad copy, and money models.
