/cinematic-ad-generator

Generates **cinematic action-based video ads** with strong hooks, pattern interrupts, and character consistency.

Placeholders like ACME Agency, <id> and you@example.com mark values that are per-agency — your install fills them with YOUR clients and accounts. If a section references a helper script you don't have yet, it ships with that workflow's install.

Skill: /<id>

Overview

Generates cinematic action-based video ads with strong hooks, pattern interrupts, and character consistency. Characters NEVER talk to camera — they perform actions, react, and the ElevenLabs voiceover narrates over the visuals.

This is the proven flow for generating scroll-stopping Facebook/Meta video ads. It was built after extensive testing of HeyGen, multi-shot, lip sync, and Veo, none of which produced results as reliable as this per-scene cinematic approach.

What this skill produces:

What this skill does NOT do:


The Golden Rules

  1. NO talking-to-camera shots. Characters do actions: throw papers, burn money, walk, gesture, react. The voiceover is the narrator, the visuals are the metaphor.
  2. Hook in the first 3 seconds. Use a pattern interrupt, something visually unexpected: burning money, throwing papers, slamming a laptop. Action that stops the scroll.
  3. Character consistency via NB2 reference. Generate one strong character portrait first, then reuse it as the imageUrls reference for all scenes featuring that character.
  4. Per-scene NB2 → Kling. Each scene starts from its OWN precision-crafted NB2 image. This is the only way to get dramatic visuals (burning money, etc.), because multi-shot won't follow text-only prompts for action shots.
  5. Catbox for image URLs. Always upload NB2 images to catbox.moe before passing them to Kling, because Krea CDN URLs are unreliable for Kling's image fetcher.
  6. ElevenLabs as separate track. Kling direct API has no audio. ElevenLabs generates the German VO. CapCut overlays both.

Preflight (run BEFORE any expensive Krea/Kling/ElevenLabs call)

A cinematic-ad batch is the most expensive video pipeline in this workspace (6 NB2 images + 6 Kling 3.0 videos + ElevenLabs VO). Validate everything BEFORE spending credits.

  1. Client exists in clients.json (3-step cascade). Resolve canonical key.
  2. Required env vars: KREA_API_KEY, KLING_ACCESS_KEY, KLING_SECRET_KEY, ELEVENLABS_API_KEY all set in .env. If any missing → abort, name the missing ones.
  3. Catbox.moe is reachable — quick GET request. If down → abort with "catbox.moe upload host is down, retry later".
  4. drive_folder_id is present and reachable via listFiles().
  5. Slack channel resolves if reporting enabled.
  6. Kling duration cap: scene durations must be 5 or 10 for Kling 2.6, OR 3-15 for Kling 3.0. Reject anything else.
  7. Scene count is sane: 4-8 scenes. More than 8 = burns through credits with no quality return.
  8. Language is supported by ElevenLabs for <id> (HR, BS, DE, EN, ES, FR, IT, etc.). If unsupported → abort.
  9. Disk space ≥ 1 GB free in client folder (videos are big).
  10. CLIENT.md exists (or proceed with brief-only and explicitly note the gap).

If all checks pass, log "preflight: OK (n scenes × {duration}s each, est cost: ~Y compute units + ~Z s of VO)" and proceed.
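The offline parts of the preflight (env vars, scene count, duration caps) can be sketched as a pure function. This is a minimal sketch, not the workspace's actual preflight: `runPreflight`, `isValidDuration`, the `klingVersion` strings and the scene shape are hypothetical names chosen for illustration. The network and disk checks (catbox, Drive, Slack, free space) are left out.

```javascript
// Hypothetical helper: covers only the checks that need no network or disk access.
const REQUIRED_ENV = ['KREA_API_KEY', 'KLING_ACCESS_KEY', 'KLING_SECRET_KEY', 'ELEVENLABS_API_KEY'];

// Kling 2.6 accepts exactly 5 or 10 s; Kling 3.0 accepts 3 to 15 s.
function isValidDuration(klingVersion, seconds) {
  if (klingVersion === '3.0') return seconds >= 3 && seconds <= 15;
  if (klingVersion === '2.6') return seconds === 5 || seconds === 10;
  return false; // unknown version: reject
}

function runPreflight({ env, klingVersion, scenes }) {
  const errors = [];

  // Check 2: name every missing env var instead of failing on the first one.
  const missing = REQUIRED_ENV.filter((name) => !env[name]);
  if (missing.length > 0) errors.push(`missing env vars: ${missing.join(', ')}`);

  // Check 7: scene count must be 4 to 8.
  if (scenes.length < 4 || scenes.length > 8) {
    errors.push(`scene count ${scenes.length} is outside 4-8`);
  }

  // Check 6: per-scene duration cap depends on the Kling version.
  scenes.forEach((scene, index) => {
    if (!isValidDuration(klingVersion, scene.duration)) {
      errors.push(`scene ${index + 1}: ${scene.duration}s is not allowed for Kling ${klingVersion}`);
    }
  });

  return { ok: errors.length === 0, errors };
}
```

A caller would pass `process.env` as `env` and abort before any Krea, Kling or ElevenLabs call when `ok` is false, printing `errors` so the missing pieces are named.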


Step-by-Step Workflow

Step 0 — Identify Client

Look up in ACME Agency/clients/clients.json (or 3-step cascade per Paradox CLAUDE.md). Extract:

Step 1 — Brief Collection

Ask up to 5 questions (skip any the user already answered in their initial message):

  1. Target audience — who exactly? (e.g. "Selbstständige earning >75K", "Angestellte 30-50 with chronic back pain")
  2. The offer / transformation — what they get from acting
  3. Existing script? — paste it, or skip (default: write one for them)
  4. Hook concept? — specific visual idea (e.g. "burning money") or skip (default: propose 2-3)
  5. Target duration? — 15 / 20 / 30 / 45 / 60 seconds (default: 30)

Detect language from client.market → Germany = de, Croatia/Bosnia = hr, etc.

Step 2 — Write the Script (the most important step)

This step has TWO paths — either the user gave you a script, or you write one.

Path A: User provided a script
  1. Parse it for: hook, problem, mechanism, solution, result, CTA
  2. Validate word count vs target duration:
     - If too long, trim: cut filler, not core message
  3. Split into N scenes (see "Scene Count by Duration" table below)
  4. Skip to Step 2.5 (Approval Gate)

Path B: No script provided, so WRITE ONE using direct-response principles

FIRST: Read .claude/skills/copywrite/PRINCIPLES.md to load the 12 direct-response copywriting principles. This is non-negotiable — that file has the frameworks you need.

Then apply this 6-beat video ad framework (maps directly to copywriting principles):

BEAT 1 — HOOK (Pattern Interrupt — Principle 2 hook type #4)
  → 3-7 word VO question that names the audience by their identity
  → Must stop the scroll in first 3 seconds
  → Examples: "Selbstständig in Deutschland?", "Verdienst du über 77.000 brutto?"

BEAT 2 — PROBLEM (specificity + emotion — Principles 3 + 4)
  → 8-12 word VO with concrete number, not vague pain
  → Fear of loss > desire for gain
  → Example: "Die GKV kostet dich jeden Monat über 900 Euro."

BEAT 3 — MECHANISM REVEAL (Principle 5 — your unique explanation)
  → 8-12 word VO that names WHY it works / what most people don't know
  → This is the differentiator — gives the viewer a reason to believe
  → Example: "Was die meisten nicht wissen: Dein Arbeitgeber zahlt die Hälfte mit."

BEAT 4 — PROOF / SOLUTION (specificity — Principle 4)
  → 8-12 word VO with the product/service in action + a number
  → Example: "ACME Agency prüft in 60 Sekunden, ob du sparen kannst."

BEAT 5 — RESULT (identity payoff — Principle 3)
  → 5-10 word VO showing the new state, the transformation
  → Example: "Bis zu 400 Euro weniger. Jeden Monat."

BEAT 6 — CTA (risk removal — Principle 8)
  → 5-8 word VO with the action + risk removal
  → Example: "Jetzt kostenlos prüfen — dauert 60 Sekunden."

Total VO word count must not exceed: <id> × 2.4 (de/hr) or × 2.6 (en). The approval gate in Step 2.5 reports the count against this ceiling (e.g. "68 / 72 max").
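The word-count rule can be checked mechanically before the approval gate. The sketch below assumes the placeholder in the rule stands for the target duration in seconds, which matches the "Scene Count by Duration" table (15s gives ~36 words, 30s gives ~72). `voWordBudget`, `countWords` and `checkScript` are hypothetical helper names, and whitespace splitting is only an approximation of how a voice actually paces the text.

```javascript
// Words per second of voiceover, taken from the rule above.
const WORDS_PER_SECOND = { de: 2.4, hr: 2.4, en: 2.6 };

// Assumption: the budget is target duration (seconds) times the per-language rate.
function voWordBudget(durationSeconds, lang) {
  const rate = WORDS_PER_SECOND[lang];
  if (rate === undefined) throw new Error(`no word rate for language "${lang}"`);
  return Math.round(durationSeconds * rate);
}

// Counts whitespace-separated tokens; an empty string counts as zero words.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Sums the per-scene VO lines and compares the total against the budget.
function checkScript(sceneTexts, durationSeconds, lang) {
  const words = sceneTexts.reduce((sum, text) => sum + countWords(text), 0);
  const budget = voWordBudget(durationSeconds, lang);
  return { words, budget, withinBudget: words <= budget };
}
```

For example, `checkScript(['Bis zu 400 Euro weniger.', 'Jeden Monat.'], 15, 'de')` counts 7 words against a budget of 36.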

For longer ads (45s+): Add an extra PROOF beat or a TESTIMONIAL beat between MECHANISM and RESULT.

For shorter ads (15-20s): Compress to 4 beats: HOOK + PROBLEM + SOLUTION + CTA.

Anti-AI sweep (Principle 12) — REQUIRED before approval gate

Before showing the script to the user, scan it for and remove:

Read it aloud. If it sounds like an AI wrote it, rewrite it.

Scene Count by Duration

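| Target Duration | Scenes | Per-Scene Duration | Word Budget (de/hr) |
|-----------------|--------|--------------------|---------------------|
| 15s             | 3      | 5s each            | ~36 words           |
| 20s             | 4      | 5s each            | ~48 words           |
| 25s             | 5      | 5s each            | ~60 words           |
| 30s (default)   | 6      | 5s each            | ~72 words           |
| 45s             | 9      | 5s each            | ~108 words          |
| 60s             | 12     | 5s each            | ~144 words          |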

Kling 3.0 image-to-video produces 5s clips reliably. Keep all scenes uniform at 5s.
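Because every scene is a uniform 5 seconds, the scene count in the table is simply the target duration divided by 5. A minimal sketch of that split follows; `planScenes` and the returned object shape are hypothetical, not part of the skill's libraries.

```javascript
// Uniform clip length, per the note above.
const SCENE_SECONDS = 5;

// Splits a target duration into equal 5 s scenes with their start offsets.
function planScenes(targetSeconds) {
  if (targetSeconds <= 0 || targetSeconds % SCENE_SECONDS !== 0) {
    throw new Error(`target duration must be a positive multiple of ${SCENE_SECONDS}s`);
  }
  const count = targetSeconds / SCENE_SECONDS;
  return Array.from({ length: count }, (_, index) => ({
    scene: index + 1,               // 1-based scene number
    duration: SCENE_SECONDS,        // seconds of video for this scene
    start: index * SCENE_SECONDS,   // offset of the scene on the timeline
  }));
}
```

`planScenes(30)` yields the default six scenes; `planScenes(60)` yields twelve.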

Step 2.5 — Script Approval Gate (REQUIRED)

After writing the script (Path A or B), ALWAYS show the user this breakdown BEFORE generating any images:

═══════════════════════════════════════════════════════════════
SCRIPT — [Client Name] | [Audience] | [Duration]s | [N] scenes
═══════════════════════════════════════════════════════════════

Scene | Visual Hook                  | Voiceover ([Lang])
------|------------------------------|--------------------------------
  1   | Burning euro bills           | "Selbstständig in Deutschland?"
  2   | Frustrated at desk           | "Die GKV kostet dich..."
  3   | Phone shows comparison       | "ACME Agency prüft in 60 Sek..."
  4   | Walking into clinic          | "Sofort zum Facharzt..."
  5   | Relaxed, smiling             | "Bis zu 400 Euro weniger..."
  6   | Branded CTA card             | "Jetzt kostenlos prüfen."

Voice: Chris Norddeutscher (German)
VO word count: 68 / 72 max (within target)
═══════════════════════════════════════════════════════════════

Then ask:

"Approve this script and proceed to image generation? Or any scene you want to revise?"

Wait for explicit approval. Do not spend Krea credits without it.
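The breakdown can be rendered from the scene list so that the word count in the footer is always computed, never typed by hand. This is an illustrative sketch: `renderApproval` and its field names are hypothetical, the layout only approximates the block above, and the "over budget" wording is an assumption.

```javascript
// Counts whitespace-separated tokens in one VO line.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Builds the plain-text approval breakdown shown to the user before any image is generated.
function renderApproval({ client, audience, duration, lang, voice, maxWords, scenes }) {
  const rule = '═'.repeat(63);
  const words = scenes.reduce((sum, s) => sum + countWords(s.text), 0);
  const status = words <= maxWords ? 'within target' : 'over budget';

  // One row per scene: number, visual hook, quoted voiceover line.
  const rows = scenes.map(
    (s, i) => `  ${String(i + 1).padEnd(4)}| ${s.visual.padEnd(29)}| "${s.text}"`
  );

  return [
    rule,
    `SCRIPT: ${client} | ${audience} | ${duration}s | ${scenes.length} scenes`,
    rule,
    '',
    `Scene | ${'Visual Hook'.padEnd(29)}| Voiceover (${lang})`,
    `------|${'-'.repeat(30)}|${'-'.repeat(32)}`,
    ...rows,
    '',
    `Voice: ${voice}`,
    `VO word count: ${words} / ${maxWords} max (${status})`,
    rule,
  ].join('\n');
}
```

Printing the result to stdout and then waiting for an explicit "yes" keeps the gate in one place.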

Step 3 — Generate Character Reference Image (if needed)

If the ad features a recurring person, generate ONE strong NB2 portrait first:

import { <id> } from './ACME Agency/scripts/lib/krea.mjs';

const result = await <id>({
  prompt: 'Portrait of a [age] [nationality] [gender], [hair], [outfit], [setting], confident eye contact, photorealistic, editorial portrait, 1:1',
  aspectRatio: '1:1',
  batchSize: 2,  // generate 2 to pick the best
  resolution: '2K',
});
// Pick the best one — show user, get confirmation

Save the chosen CDN URL — use it as imageUrls[characterRef] in later NB2 calls for character consistency.

Step 4 — Generate 6 Scene Images via NB2

For EACH scene, write a precision NB2 prompt. Use these templates:

Hook scene (action close-up):

Extreme close-up macro shot of [DRAMATIC OBJECT/ACTION]. [SPECIFIC DETAILS]. 
Held in [hand/context]. Dark moody background, dramatic low-key lighting. 
Cinematic, intense, attention-grabbing. Shallow depth of field. 9:16 vertical

Character scene (use character reference):

The person in the reference image [SPECIFIC ACTION/EMOTION], [POSITION/POSE], 
[ENVIRONMENT DETAILS]. [LIGHTING]. [MOOD]. Photorealistic, 9:16 vertical
→ Pass character ref via imageUrls

Product/UI scene (no character):

[Subject — phone/equipment/product] showing [SPECIFIC UI/DETAIL]. 
[BRAND COLORS]. [LIGHTING]. Shallow depth of field. Photorealistic, 9:16 vertical

CTA card (no character):

Bold [brand color] solid background. Large [accent color] text reading 
[CTA TEXT] centered. Clean geometric heavy sans-serif. [Brand logo] below. 
Minimalist premium. 9:16 vertical

Generate all 6 sequentially via <id>(). Save locally to ACME Agency/clients/<Client>/video-ads/<campaign>/reference-images/.
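Only character scenes carry the reference image; hook, product and CTA scenes are generated from the prompt alone. The sketch below builds the request object per scene under those rules. The parameter names `prompt`, `aspectRatio` and `imageUrls` follow the Step 3 snippet and templates, while `buildSceneRequest`, the scene shape and the assumption that `imageUrls` is an array are hypothetical.

```javascript
// Builds the NB2 request options for one scene.
// Assumption: imageUrls is an array of reference image URLs.
function buildSceneRequest(scene, characterRefUrl) {
  const request = {
    prompt: scene.prompt,
    aspectRatio: '9:16', // all scene templates are 9:16 vertical
  };

  if (scene.type === 'character') {
    // Character scenes must reuse the approved portrait for consistency.
    if (!characterRefUrl) {
      throw new Error(`scene "${scene.name}" needs a character reference URL`);
    }
    request.imageUrls = [characterRefUrl];
  }

  return request;
}
```

Failing fast on a missing reference avoids spending credits on a character scene that would not match the others.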

Step 5 — Upload Images to catbox.moe (CRITICAL)

Krea CDN URLs are unreliable for Kling's image fetcher. Always upload NB2 images to catbox first:

import { uploadToCatbox } from './ACME Agency/scripts/lib/kling.mjs';

// images: { sceneName: localPath } for the PNGs saved in Step 4
const catboxUrls = {};
for (const [name, localPath] of Object.entries(images)) {
  catboxUrls[name] = await uploadToCatbox(localPath, 'image/png');
}
// Save catbox URLs to a JSON file in the reference-images folder for re-use

Step 6 — Animate Each Scene with Kling 3.0 Direct

For each scene, call <id>() from ACME Agency/scripts/lib/kling.mjs:

import { copyFile } from 'node:fs/promises';
import { <id> } from './ACME Agency/scripts/lib/kling.mjs';

for (const scene of scenes) {
  const videoPath = await <id>({
    prompt: scene.animationPrompt,  // describes MOTION, not the static frame
    model: 'kling-v3',
    aspectRatio: '9:16',
    duration: 5,
    startImage: catboxUrls[scene.name],
  });
  await copyFile(videoPath, `clips/${scene.name}.mp4`);
}

Animation prompt rules:

Step 7 — Generate ElevenLabs Voice (auto-selected by language)

Voice selection priority (highest first):

  1. --voice <id|name> CLI flag (user override)
  2. client.elevenlabs_voice_id from clients.json
  3. Auto-selection by client language

Voice by language table (verified IDs from current ElevenLabs account):

| Language      | Voice ID             | Name                | Notes                                 |
|---------------|----------------------|---------------------|---------------------------------------|
| de (German)   | j46AY0iVY3oHcnZbgEJg | Chris Norddeutscher | North German pro, authoritative       |
| de (alt)      | TUKJhQmz3RPYBNAgC5A1 | Clark Clear         | German pro, alternative               |
| de (alt)      | DtAQqD4yK3kXSVPx7wFc | Pascal R            | German narrator/storyteller           |
| hr (Croatian) | ZLYZToA7aDsMbHwM9AOr | Luka                | Croatian male, calm                   |
| hr (alt)      | FXFcxnjikw0naYO1PPrU | Adnan               | Croatian male, casual                 |
| en (English)  | JBFqnCBsd6RMkjVDRZzb | George              | British storyteller, premade (free)   |
| en (alt)      | EXAVITQu4vr4xnSDxMaL | Sarah               | American female, mature, premade      |

Note: German and Croatian voices are "professional" voices and require a paid ElevenLabs plan. English premade voices work on the free plan.

import { <id>, generateSpeech } from './ACME Agency/scripts/lib/elevenlabs.mjs';

// Auto-select voice from language (or override)
const VOICE_BY_LANG = {
  de: 'j46AY0iVY3oHcnZbgEJg',  // Chris Norddeutscher
  hr: 'ZLYZToA7aDsMbHwM9AOr',  // Luka
  en: 'JBFqnCBsd6RMkjVDRZzb',  // George
};

// Detect language from client.market or script.language
const lang = client.market === 'Germany' ? 'de'
           : client.market === 'Croatia' || client.market === 'Bosnia' ? 'hr'
           : 'en';

const voiceId = client.elevenlabs_voice_id || VOICE_BY_LANG[lang];

// Conversational settings (works for all 3 languages)
const voiceSettings = { stability: 0.4, similarityBoost: 0.85, style: 0.1 };

await <id>({ scenes, voiceId, outputDir, voiceSettings });

// Also generate full track
const fullScript = scenes.map(s => s.text).join(' ');
await generateSpeech({ text: fullScript, voiceId, destPath: 'voiceover-full.mp3', ...voiceSettings });
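The snippet above resolves priorities 2 and 3 of the voice selection order but does not show the `--voice <id|name>` override. A minimal sketch of the full three-level priority follows. `resolveVoiceId` and the `cliVoice` parameter are hypothetical names, the voice IDs are the ones in the voice table, and treating an unrecognised `--voice` value as a raw voice ID is an assumption.

```javascript
// Voices from the "Voice by language" table.
const VOICES = [
  { id: 'j46AY0iVY3oHcnZbgEJg', name: 'Chris Norddeutscher' },
  { id: 'TUKJhQmz3RPYBNAgC5A1', name: 'Clark Clear' },
  { id: 'DtAQqD4yK3kXSVPx7wFc', name: 'Pascal R' },
  { id: 'ZLYZToA7aDsMbHwM9AOr', name: 'Luka' },
  { id: 'FXFcxnjikw0naYO1PPrU', name: 'Adnan' },
  { id: 'JBFqnCBsd6RMkjVDRZzb', name: 'George' },
  { id: 'EXAVITQu4vr4xnSDxMaL', name: 'Sarah' },
];

const DEFAULT_VOICE_BY_LANG = {
  de: 'j46AY0iVY3oHcnZbgEJg', // Chris Norddeutscher
  hr: 'ZLYZToA7aDsMbHwM9AOr', // Luka
  en: 'JBFqnCBsd6RMkjVDRZzb', // George
};

// Priority: 1) CLI override, 2) client.elevenlabs_voice_id, 3) default for the language.
function resolveVoiceId({ cliVoice, client, lang }) {
  if (cliVoice) {
    // Accept a known voice name (case-insensitive); otherwise treat the value as an ID.
    const byName = VOICES.find((v) => v.name.toLowerCase() === cliVoice.toLowerCase());
    return byName ? byName.id : cliVoice;
  }
  if (client && client.elevenlabs_voice_id) return client.elevenlabs_voice_id;

  const fallback = DEFAULT_VOICE_BY_LANG[lang];
  if (!fallback) throw new Error(`no default voice for language "${lang}"`);
  return fallback;
}
```

Throwing on an unknown language mirrors the preflight rule that unsupported languages abort the run.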

Step 8 — Upload Everything to Drive

Folder structure: Klijenti/<Client>/Video Ads/<Year>/<Month>/<campaign>/
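The folder path can be derived from the client, campaign and date. In this sketch `driveFolderPath` is a hypothetical helper, and rendering the month as a zero-padded number is an assumption; the source does not say whether months are numbers or names.

```javascript
// Builds the Drive folder path for a campaign.
// Assumption: the month segment is a zero-padded number ("04"), not a month name.
function driveFolderPath(clientName, campaign, date = new Date()) {
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, '0'); // getMonth() is 0-based
  return `Klijenti/${clientName}/Video Ads/${year}/${month}/${campaign}/`;
}
```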

Upload: clips, voiceover MP3s, reference-images PNGs.

Step 9 — Post Slack Report

This is the ONLY Slack message for the entire execution. Do NOT post scene-by-scene status, script breakdowns, retry notifications, or any intermediate updates to Slack during Steps 1-8. All progress goes to stdout (console.log) only. The team reads this one final report, not a play-by-play.

Incident reference: 2026-04-10 ACME Agency — the subprocess posted 8+ separate messages to the client channel during execution. Do not repeat.

Use the standard format:

Verification (run AFTER Step 9 — confirm all assets actually shipped)

Cinematic ads have many moving parts that can silently fail. Check ALL of these before declaring done:

If any scene failed mid-batch, list the failed scene number(s) explicitly in the report. Never claim a 6-scene success when only 5 actually rendered.
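One way to make the "never claim 6 when 5 rendered" rule hard to violate is to derive the report line from per-scene results. This is a minimal sketch; `summarizeBatch` and the `{ scene, ok }` result shape are hypothetical.

```javascript
// Summarises per-scene render results into one report line.
// results: [{ scene: 1, ok: true }, { scene: 2, ok: false }, ...] (hypothetical shape)
function summarizeBatch(results) {
  const failed = results.filter((r) => !r.ok).map((r) => r.scene);
  const rendered = results.length - failed.length;
  const headline = `${rendered}/${results.length} scenes rendered`;

  if (failed.length === 0) return { ok: true, failed, line: headline };
  return { ok: false, failed, line: `${headline}; failed scene(s): ${failed.join(', ')}` };
}
```

The `line` value can go straight into the single Slack report from Step 9, so the success claim always matches what actually rendered.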


CapCut Assembly Guide (delivered to user)

Include this in the Drive folder as capcut-guide.md:

1. Import all 6 clips from clips/ folder in order
2. Drop voiceover-full.mp3 on audio track 2
3. Adjust clip timing if VO doesn't perfectly align
4. Audio track 3: search CapCut music library for [mood] background music, set to -18dB
5. Audio track 4: SFX from CapCut library:
   - Scene 1: [relevant SFX — fire, paper rustling, etc.]
   - Scene N: [...]
6. Captions → Auto Captions → German → ACME Agencyw
7. Export 1080×1920, MP4, H.264

Critical Files

Reference Examples (proven campaigns)

Both campaigns: 6 scenes, 30s, character consistency, German VO, ready for CapCut.


Hard Constraints

| Rule | Why |
|------|-----|
| Always use Kling 3.0 direct (kling-v3) for video | Veo 3.1 doesn't support image-to-video. Kling 2.6 via Krea is unstable. |
| Always upload NB2 images to catbox before Kling | Krea CDN URLs fail unpredictably for Kling's fetcher |
| Never use --mode multishot for action ads | Multi-shot can't render dramatic single-frame hooks (burning money etc.) |
| Never use lip sync endpoint | Adds black artifacts in mouth area |
| Never describe characters as "talking to camera" | No audio API = looks fake. Always action-based shots. |
| Auto-select voice from client.market (de/hr/en) | See "Voice by language table"; overridable with the --voice flag |
| Always show script approval gate before image generation | Krea credits cost money. 30-second confirmation saves rework |
| Max 12 scenes / 60s total | Beyond this, viewers drop off. 30s is the sweet spot for Meta. |

Why This Skill Exists

Built after testing every alternative:

| Approach | Problem |
|----------|---------|
| HeyGen avatar | Black-box, no creative control, character can't do dramatic actions |
| Kling multi-shot | Can't render dramatic single-frame visuals like burning money |
| Kling direct UGC + lip sync | Lip sync adds artifacts (black spots in mouth) |
| Veo 3.1 via Krea | Doesn't support image-to-video, text-only |
| Per-scene NB2 → Kling 3.0 + ElevenLabs VO | WORKS: this is what the skill does |

This skill is the answer to the question: "How do I generate scroll-stopping cinematic Facebook ads with hooks that actually work?"