/heygen-ad-generator

Generates a finished HeyGen Video Agent ad — no CapCut, no ElevenLabs, no Krea.

Placeholders like ACME Agency, <id> and you@example.com mark values that are per-agency — your install fills them with YOUR clients and accounts. If a section references a helper script you don't have yet, it ships with that workflow's install.

Skill: /heygen-ad-generator

Overview

Generates a finished HeyGen Video Agent ad — no CapCut, no ElevenLabs, no Krea. One prompt → HeyGen handles voice, b-roll, editing, text overlays, and music.

scenes mode is the DEFAULT first choice for video ads (avatar-less)

For a normal video ad, use --mode scenes: a pure b-roll + kinetic-text + voiceover montage with NO avatar. It's cheap (~$0.50/video, billed to the API dollar balance — see cost note below), fast, scalable, and pulls the client's brand colors automatically from their design-theme.json. This is what to reach for unless the brief specifically wants a presenter speaking to camera. (Validated on ACME Agency 2026-06-23 — ACME Agency/clients/ACME Agency/video-ads/heygen-scenes-batch/.)

Use the avatar modes (agent / multi-scene) only when the brief explicitly wants a person talking to camera (testimonial / founder pitch / explainer).

Cost model (important): in the API beta, Video Agent bills the API dollar balance, not plan credits. getRemainingQuota() reads GET /v2/user/remaining_quota where 60 units = $1; plan_credit (e.g. 200) is a separate counter and is untouched. A ~14–20s scenes video ≈ $0.47–0.66. The scenes pipeline prints the per-video cost and remaining balance, and puts the cost in the manifest + Slack post.
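The unit conversion in the cost note can be sketched as below. This is a minimal illustration, not the pipeline's own code: only the 60 units = $1 rate comes from the note above, and the function names quotaUnitsToUsd and videosAffordable are hypothetical.

```javascript
// Conversion rate stated in the cost note: 60 quota units = $1.
const UNITS_PER_USD = 60;

// Convert a raw remaining_quota value into dollars.
function quotaUnitsToUsd(units) {
  return units / UNITS_PER_USD;
}

// Number of whole videos a balance covers at a given per-video cost.
// (Hypothetical helper; the per-video cost is supplied by the caller.)
function videosAffordable(units, costPerVideoUsd) {
  return Math.floor(quotaUnitsToUsd(units) / costPerVideoUsd);
}

console.log(quotaUnitsToUsd(300).toFixed(2)); // 5.00
console.log(videosAffordable(300, 0.5));      // 10
```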

What this skill produces:

What this skill does NOT do:

When to use:

Trigger:


Critical Files


Scenes Script Format (DEFAULT for video — avatar-less)

Write this to ACME Agency/clients/<Client>/video-ads/<campaign-slug>/script.json:

{
  "client": "ACME Agency",
  "campaign": "<id>",
  "mode": "scenes",
  "language": "hr",
  "duration": 20,
  "vo": "Inspektor je na vratima. A evidencije? ... Pošaljite upit za besplatnu provjeru.",
  "scenes": [
    "[0-2s] HOOK: a restaurant glass door opens and a food-safety inspector with a clipboard steps in, tense cinematic lighting. Boxed overlay (top): \"Inspektor je stigao.\"",
    "[2-6s] frantic hands flipping through messy paper binders on a kitchen counter. Boxed overlay: \"Gdje su evidencije?\"",
    "[6-10s] calm close-up of a hand holding a smartphone with coral checkmarks. Boxed overlay: \"Sve na mobitelu.\"",
    "[10-15s] confident owner relaxed in a spotless modern kitchen. Boxed overlay: \"Spremni za inspekciju.\"",
    "[15-19s] solid navy end card. Centered boxed overlay: \"Pošaljite upit\""
  ]
}

Run it:

node "ACME Agency/scripts/heygen_ads_generate.mjs" "ACME Agency" \
  --script "ACME Agency/clients/ACME Agency/video-ads/<campaign>/script.json" \
  --mode scenes [--no-slack] [--drive]

Prints the per-video cost + remaining balance. Generate 3–4 variants (different hooks) by writing 3–4 script.json files and running each — each ≈ $0.50.
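Before spending money on a render, it can help to sanity-check the [start-end s] markers in the scenes array of a script.json. The sketch below assumes every scene string begins with a marker like [0-2s], as in the example above. The helper names are hypothetical, and whether the real pipeline performs any such check is not stated here.

```javascript
// Parse the leading "[start-end s]" marker of a scene string, e.g. "[2-6s] ...".
function parseSceneTiming(scene) {
  const match = /^\[(\d+)-(\d+)s\]/.exec(scene);
  if (!match) return null;
  return { start: Number(match[1]), end: Number(match[2]) };
}

// Return a list of human-readable problems; an empty list means the timings look sane.
function checkSceneTimings(script) {
  const problems = [];
  let previousEnd = 0;
  script.scenes.forEach((scene, index) => {
    const label = `scene ${index + 1}`;
    const timing = parseSceneTiming(scene);
    if (!timing) {
      problems.push(`${label}: missing [start-end s] marker`);
      return;
    }
    if (timing.start >= timing.end) problems.push(`${label}: start is not before end`);
    if (timing.start < previousEnd) problems.push(`${label}: overlaps the previous scene`);
    if (timing.end > script.duration) problems.push(`${label}: runs past the ${script.duration}s duration`);
    previousEnd = timing.end;
  });
  return problems;
}
```

Calling checkSceneTimings on the parsed script.json and aborting when the returned list is non-empty keeps a malformed script from reaching the API.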


Script Format (Avatar Video Agent mode — only when a presenter is wanted)

Write this to ACME Agency/clients/<ClientName>/video-ads/<campaign-slug>/script.json:

{
  "client": "ACME Agency",
  "campaign": "<id>",
  "adType": "heygen",
  "mode": "agent",
  "language": "de",
  "duration": 30,
  "avatarGender": "male",
  "avatarPersona": "Northern European male, 40s, dark business suit, clean-shaven or light stubble, confident and authoritative",
  "voiceDescription": "German male, 40s, clear northern German accent, calm and authoritative",
  "brandColors": {
    "primary": "#your-channel",
    "accent": "#C9A84C",
    "text": "#FFFFFF"
  },
  "musicMood": "professional",
  "fullScript": "Seit Ihrem letzten Jobwechsel zahlen Sie den GKV-Höchstbeitrag...",
  "sceneGuidance": [
    { "timing": "0-8s", "mediaType": "stock footage", "visual": "German professional at laptop in bright office", "overlay": "Sie zahlen GKV-Maximum" },
    { "timing": "8-18s", "mediaType": "motion graphics", "visual": "Animated bar chart: GKV €900 vs PKV €600 monthly", "overlay": "Sparen Sie bis zu €300/Monat" },
    { "timing": "18-25s", "mediaType": "stock footage", "visual": "Person reviewing documents at home desk, relaxed", "overlay": "" },
    { "timing": "25-30s", "mediaType": "avatar closeup", "visual": "Avatar speaks CTA directly to camera", "overlay": "Jetzt kostenlosen Vergleich anfragen" }
  ]
}

Campaign slug: <keyword>-<audience/offer>-<YYYY-MM> e.g. <id>
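One way to build a slug in the <keyword>-<audience/offer>-<YYYY-MM> pattern is sketched below. The helper names and the sample inputs are hypothetical, and the normalization rules (lowercase ASCII, hyphens between words) are an assumption rather than something the pipeline is known to enforce.

```javascript
// Lowercase, strip combining diacritics, and collapse everything else to hyphens.
// Limitation: letters that do not decompose under NFD (e.g. "đ") become hyphens.
function slugPart(text) {
  return text
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Join keyword, audience/offer, and the campaign month (taken in UTC).
function campaignSlug(keyword, audienceOrOffer, date) {
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${slugPart(keyword)}-${slugPart(audienceOrOffer)}-${date.getUTCFullYear()}-${month}`;
}

// Sample values are made up for illustration.
console.log(campaignSlug("PKV Wechsel", "Angestellte", new Date(Date.UTC(2026, 5, 1))));
// pkv-wechsel-angestellte-2026-06
```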


VO Word Count Formula — CRITICAL

HeyGen speech rate ≈ 2.3–2.5 words/second
Formula: duration × 2.4 = target word count
Example: 30s = 72 words | 25s = 60 words | 20s = 48 words

Count words in fullScript BEFORE saving. Stay at or under the target — never go over.
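The word-count check can be sketched as follows. This is an illustration under stated assumptions: words are counted by splitting on whitespace, and fractional targets are rounded down so the script never exceeds the target. The function names are hypothetical.

```javascript
// duration × 2.4, computed with integers (× 24 / 10) to avoid floating-point drift,
// then rounded down so the target is never overstated.
function targetWordCount(durationSeconds) {
  return Math.floor((durationSeconds * 24) / 10);
}

// Whitespace-separated word count; an empty or blank string counts as 0.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Compare a script against the target for its duration.
function checkVoLength(fullScript, durationSeconds) {
  const words = countWords(fullScript);
  const target = targetWordCount(durationSeconds);
  return { words, target, ok: words <= target };
}

console.log(targetWordCount(30)); // 72
console.log(targetWordCount(20)); // 48
```

If ok is false, trim the script before saving rather than raising the duration to fit.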


Avatar Persona Guide

| Market | Language | Default Persona |
|--------|----------|-----------------|
| Croatia / Bosnia | hr / bs | Southern European female, 35-45, warm and professional, business casual |
| Germany | de | Northern European male, 40-50, dark business suit, clean-shaven, authoritative |
| Austria | de | Central European male/female, 35-50, serious and trustworthy |
| English B2B | en | Professional Western, 35-50, confident expert, business formal |
| English B2C | en | Relatable, matches target demographic (age/style) |

Tips:


Music Moods

| Mood | Use when |
|------|----------|
| professional | Insurance, B2B, clinics, authority services |
| energetic | Fitness, youth, product reveals, urgency |
| warm | Lifestyle, family, wellness, emotional appeal |
| emotional | Testimonials, transformation stories |
| neutral | Luxury, real estate, minimalist brands |

Scene Guidance Tips

Write 3–5 scene guidance blocks. Each block tells HeyGen what to show and when.


Preflight (run BEFORE any HeyGen API call)

Every HeyGen render is billed and the Video Agent has a long polling timeout, so failing fast saves real money. Validate everything BEFORE submitting the job.

  1. Client exists in clients.json. Resolve canonical key.
  2. HEYGEN_API_KEY set in .env. If missing → abort.
  3. orientation is portrait or landscape — NEVER 9:16 (HeyGen API silently rejects shortcut ratios). See CLAUDE.md ## AI Generation API Constraints.
  4. Multi-scene mode: every scene has a valid speaker_id. Multi-scene text_overlay requires ALL fields: type, font_family, font_size, font_weight, color, line_height, position, text_align — missing fields = silent failure.
  5. Polling timeout is ≥ 20 minutes in the script call. Less than 20 min = false negative on slow renders.
  6. Language is supported by HeyGen's voice library for the selected voice. List with --list-voices <lang> if uncertain.
  7. VO word count matches duration per the calibration table in this SKILL.md (## VO Word Count Formula). If mismatch → trim or extend before submission.
  8. drive_folder_id reachable if --drive flag passed.
  9. Slack channel resolves if reporting enabled.
  10. HeyGen API dollar balance check — the Video Agent (scenes/agent) endpoint bills the API dollar balance (getRemainingQuota().usd, 60 units = $1), NOT plan_credit. The scenes path now hard-aborts when balance < $0.70/video (--ignore-balance to override). For a batch of N videos, ensure balance ≥ N × $0.70 first. A near-empty wallet otherwise fails mid-render as a generic "unknown error"; the 200 plan_credit counter is a red herring (untouched, never billed, no fallback). See memory <id>.

If all checks pass, log "preflight: OK (mode=<scenes|agent|multi-scene>, duration=<n>s, language=<x>)" and proceed.
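A few of the checks above (3, 4, 5 and 10) are purely local and can be sketched as one function. The thresholds and the overlay field list come from the checklist. The shape of the job object and the function name are hypothetical, and this is not the pipeline's actual preflight code.

```javascript
// Field list copied from check 4; thresholds from checks 5 and 10.
const REQUIRED_OVERLAY_FIELDS = [
  "type", "font_family", "font_size", "font_weight",
  "color", "line_height", "position", "text_align",
];
const MIN_POLL_TIMEOUT_MINUTES = 20;
const MIN_BALANCE_PER_VIDEO_USD = 0.7;

// Returns a list of error strings; an empty list means these checks passed.
// `job` is a hypothetical shape: { orientation, pollTimeoutMinutes, balanceUsd, videoCount, overlays }.
function preflight(job) {
  const errors = [];

  if (!["portrait", "landscape"].includes(job.orientation)) {
    errors.push(`orientation must be portrait or landscape, got "${job.orientation}"`);
  }
  if (job.pollTimeoutMinutes < MIN_POLL_TIMEOUT_MINUTES) {
    errors.push(`polling timeout ${job.pollTimeoutMinutes} min is below ${MIN_POLL_TIMEOUT_MINUTES} min`);
  }
  const requiredUsd = job.videoCount * MIN_BALANCE_PER_VIDEO_USD;
  if (job.balanceUsd < requiredUsd) {
    errors.push(`balance $${job.balanceUsd.toFixed(2)} is below $${requiredUsd.toFixed(2)} for ${job.videoCount} video(s)`);
  }
  (job.overlays ?? []).forEach((overlay, index) => {
    const missing = REQUIRED_OVERLAY_FIELDS.filter((field) => !(field in overlay));
    if (missing.length > 0) {
      errors.push(`text_overlay ${index + 1} is missing: ${missing.join(", ")}`);
    }
  });
  return errors;
}
```

Collecting every failure instead of stopping at the first one lets a single run report all the fixes a job needs.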


Workflow

Step 0 — Client lookup

Read ACME Agency/clients/clients.json. Extract:


Step 1 — Brand research

Check ACME Agency/clients/<ClientName>/brand-dna.md. Need at minimum:

If brand-dna.md doesn't exist: scrape website via Firecrawl, extract colors and tone, write brand-dna.md first.


Step 2 — Script

If --script provided: read, validate, show summary.

If --brief provided: use it directly to write the script.

If neither: ask 4 questions:

  1. What's this campaign about? (offer, key message, hook angle)
  2. Target audience? (age, job, situation)
  3. Duration? (20s / 25s / 30s) — default: 30s
  4. Tone? (authoritative / warm / energetic) — informs musicMood + avatarPersona

Write script.json to video-ads/<campaign>/script.json.

Count words in fullScript — verify against duration × 2.4.

Show confirmation before running:

Campaign:  <id> | Duration: 30s | Music: professional
Avatar:    Northern European male, 40s, dark suit | Language: German
Script:    72 words ✓ (30s × 2.4)
Scenes:    4 (stock footage → motion graphics → stock footage → avatar CTA)
Proceed?

Step 3 — Run the pipeline

node "ACME Agency/scripts/heygen_ads_generate.mjs" "ClientName" \
  --script "ACME Agency/clients/ClientName/video-ads/<campaign>/script.json" \
  --mode agent \
  [--avatar <avatar_id>] [--no-slack] [--drive]
# --avatar is optional and pins a specific avatar

Drive upload is OFF by default. Video goes local + HeyGen link only. Add --drive only when the video is final and ready to archive to Klijenti/<ClientName>/.

Pipeline phases:

Render time: ~10-15 minutes per video. Use --no-slack for silent local testing.


Output

ACME Agency/clients/<ClientName>/video-ads/<campaign>/
├── script.json
├── final-ad.mp4       ← finished 1080×1920 MP4, upload-ready
└── manifest.json

Drive (only with --drive): Klijenti/<ClientName>/Video Ads/<Year>/<campaign>/
Slack: video ready + direct download link, plus the Drive folder URL when --drive is used

Verification (run AFTER pipeline completes — confirm the video shipped)

Check ALL of these before declaring done:

If any check fails, name the gap explicitly. Never claim success when verification fails.


Multi-Video Summary Report

When generating multiple videos (e.g. 3 hooks for one campaign), the final summary you post must use HeyGen URLs, never local file paths. Local paths are inaccessible to team members on Slack.

After all videos finish:

  1. Read each campaign's manifest.json → get _assets.heygenUrl
  2. Build the summary table with HeyGen links:
| # | Hook | HeyGen Link |
|---|------|-------------|
| 1 | Pain Point — "Dok vi ovo gledate..." | https://app.heygen.com/video/xxx |
| 2 | Social Proof — "687 novih upita..." | https://app.heygen.com/video/yyy |
| 3 | Us vs Them — "Platili ste agenciju..." | https://app.heygen.com/video/zzz |
  3. If --drive was used, add the Drive folder link below the table.

Rule: Never include local file paths (like ACME Agency/clients/.../final-ad.mp4) in the report — they mean nothing to Slack users. Use heygenUrl from the manifest for every video.
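The table-building step can be sketched as below. It assumes each manifest has already been parsed from manifest.json and that the link lives at _assets.heygenUrl, as described above. The function name and the hook field on the input objects are hypothetical.

```javascript
// Build the summary table from parsed manifests.
// `videos` is a hypothetical shape: [{ hook: string, manifest: object }].
function buildSummaryTable(videos) {
  const rows = videos.map((video, index) => {
    const url = video.manifest?._assets?.heygenUrl;
    // Fail loudly rather than falling back to a local path.
    if (!url) {
      throw new Error(`video ${index + 1}: manifest has no _assets.heygenUrl`);
    }
    return `| ${index + 1} | ${video.hook} | ${url} |`;
  });
  return [
    "| # | Hook | HeyGen Link |",
    "|---|------|-------------|",
    ...rows,
  ].join("\n");
}
```

Throwing when a URL is missing enforces the rule that a local file path never ends up in the report.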


Tips


Multi-Scene Mode (fallback)

Use when you need guaranteed voice consistency (pinned voice_id per scene). Produces stiffer lip sync but never switches voices.

Script format: use scenes[] array instead of fullScript + sceneGuidance.

node "ACME Agency/scripts/heygen_ads_generate.mjs" "ACME Agency" \
  --script "ACME Agency/clients/ACME Agency/video-ads/<campaign>/script.json" \
  --mode multi-scene

See ACME Agency/clients/ACME Agency/video-ads/<id>/script.json for example.