Multimodal Ad Generation & Background Replacement
One product image in, a complete video ad out: voiceover, script, and lifestyle background included. Built for Shopify apps, catalog platforms, and any e-commerce pipeline that needs scroll-stopping creatives at scale.
Why Shopify apps need this
Single-image to full ad
Upload one product photo and get back a composited video with voiceover and script, no editing suite required.
Zero-shot background swap
Replace plain white backgrounds with lifestyle scenes (marble countertops, outdoor tracks, studio sets) via a single prompt.
Built-in voiceover
Pass your ad script and choose a voice ID; the API handles TTS compositing into the final video asset.
Pipeline-friendly
Chain background replacement into ad generation for end-to-end automation. Webhook callbacks keep your app async.
Async + webhooks
Both endpoints are async: poll for status or register a webhook_url for push-based delivery.
Shopify-scale ready
Designed for catalog volumes: process hundreds of product images into ad-ready assets with bounded concurrency.
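The bounded-concurrency idea can be sketched with a thread pool that caps how many jobs are in flight at once. This is a minimal illustration, not the platform's own batching mechanism: `process_catalog` and `fake_worker` are hypothetical names, and the worker is a stand-in for whatever submit-and-poll function your app wraps around the API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def process_catalog(
    image_urls: Iterable[str],
    worker: Callable[[str], dict],
    max_workers: int = 4,
) -> list[dict]:
    """Run `worker` over every URL with at most `max_workers` jobs in flight.

    Results come back in input order. A failing job is recorded as an
    error entry instead of aborting the whole batch.
    """
    def safe(url: str) -> dict:
        try:
            return {"image_url": url, "ok": True, "result": worker(url)}
        except Exception as exc:  # keep the batch going on per-item failures
            return {"image_url": url, "ok": False, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input iterable
        return list(pool.map(safe, image_urls))


# Hypothetical stand-in worker; a real one would call the API and poll.
def fake_worker(url: str) -> dict:
    if url.endswith("broken.png"):
        raise ValueError("unsupported image")
    return {"output_url": url.replace("products", "output")}


if __name__ == "__main__":
    urls = [f"https://cdn.example.com/products/sku-{i}.png" for i in range(5)]
    for row in process_catalog(urls, fake_worker, max_workers=2):
        print(row)
```

Threads suit this workload because each job spends most of its time waiting on HTTP responses. The right `max_workers` value depends on your account's rate limits, which this page does not state.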
Endpoints covered in this tutorial
POST /v1/multimodal-ad/generations - Generate a complete video ad from a product image
GET /v1/multimodal-ad/generations/:id - Poll ad generation status
POST /api/generate/replace-background - Swap product background via prompt
GET /api/generate/status/:id - Poll background replacement status
Generate a multimodal ad (cURL)
Submit a product image, ad script, and voice selection to POST /v1/multimodal-ad/generations. The API composites Image-to-Video, Text-to-Speech, and optional avatar into a single ad asset.
curl -X POST https://api.creativeai.run/v1/multimodal-ad/generations \
  -H "Authorization: Bearer $CREATIVEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "product_image_url": "https://cdn.example.com/products/sneaker-white.png",
    "script": "Step into comfort. The all-new CloudStep sneaker: lightweight, breathable, built for your everyday hustle.",
    "voice_id": "nova",
    "target_audience": "young_professionals",
    "webhook_url": "https://your-app.com/webhooks/creativeai"
  }'
# Response:
# {
#   "id": "ad_a1b2c3d4e5f6",
#   "status": "pending",
#   "message": "Multimodal ad generation started. Video, Voiceover, and Script are being composited.",
#   "created_at": "2026-03-17T10:30:00Z"
# }
Poll for the completed ad
If you didn't set a webhook_url, poll the status endpoint until status is "completed".
# Poll for completion
curl https://api.creativeai.run/v1/multimodal-ad/generations/ad_a1b2c3d4e5f6 \
-H "Authorization: Bearer $CREATIVEAI_API_KEY"
# Response when complete:
# {
#   "id": "ad_a1b2c3d4e5f6",
#   "status": "completed",
#   "video_url": "https://creativeai.run/output/ad-video.mp4",
#   "audio_url": "https://creativeai.run/output/ad-voiceover.mp3",
#   "script": "Step into comfort. The all-new CloudStep sneaker...",
#   "completed_at": "2026-03-17T10:31:42Z"
# }
Python: submit + poll ad generation
Python example with a polling loop and optional webhook support. Uses httpx for HTTP calls.
import os
import time

import httpx

API_KEY = os.environ["CREATIVEAI_API_KEY"]
BASE = "https://api.creativeai.run"


def generate_ad(product_image_url: str, script: str, voice_id: str = "nova",
                target_audience: str = "general", webhook_url: str | None = None) -> dict:
    """Submit a multimodal ad generation job; poll until completion unless webhook_url is set."""
    payload = {
        "product_image_url": product_image_url,
        "script": script,
        "voice_id": voice_id,
        "target_audience": target_audience,
    }
    if webhook_url:
        payload["webhook_url"] = webhook_url

    with httpx.Client(
        base_url=BASE,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=60.0,
    ) as client:
        # 1. Submit the job
        resp = client.post("/v1/multimodal-ad/generations", json=payload)
        resp.raise_for_status()
        job = resp.json()
        print(f"Job created: {job['id']} status={job['status']}")

        # 2. Poll (skip if using webhooks; the pending job is returned as-is)
        if webhook_url:
            return job

        ad_id = job["id"]
        for _ in range(60):  # up to ~5 minutes
            time.sleep(5)
            poll = client.get(f"/v1/multimodal-ad/generations/{ad_id}")
            poll.raise_for_status()
            result = poll.json()
            if result["status"] == "completed":
                print(f"Done! Video: {result['video_url']}")
                return result
            if result["status"] == "failed":
                raise RuntimeError(f"Ad generation failed: {result}")
        raise TimeoutError("Ad generation timed out after 5 minutes")


# --- Example usage ---
result = generate_ad(
    product_image_url="https://cdn.example.com/products/sneaker-white.png",
    script="Step into comfort. The all-new CloudStep sneaker: lightweight, breathable, built for your everyday hustle.",
    voice_id="nova",
    target_audience="young_professionals",
)
print(f"Video URL: {result['video_url']}")
print(f"Audio URL: {result['audio_url']}")
Replace a product background (cURL)
Swap the background of any product image via POST /api/generate/replace-background. Describe the new scene in prompt and optionally exclude artifacts with negative_prompt.
curl -X POST https://api.creativeai.run/api/generate/replace-background \
  -H "Authorization: Bearer $CREATIVEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://cdn.example.com/products/sneaker-white.png",
    "prompt": "Placed on a sleek marble countertop in a luxury apartment, soft golden hour lighting, bokeh background",
    "negative_prompt": "blurry foreground, floating objects, distorted edges"
  }'
# Response (HTTP 202):
# {
#   "id": "gen_x7k9m2p4q1",
#   "status": "pending",
#   "credits": 4,
#   "message": "Background replacement job queued. Poll /api/generate/status/{id} for result."
# }
# Poll for the result
curl https://api.creativeai.run/api/generate/status/gen_x7k9m2p4q1 \
-H "Authorization: Bearer $CREATIVEAI_API_KEY"
# Response when complete:
# {
#   "id": "gen_x7k9m2p4q1",
#   "status": "completed",
#   "output_url": "https://creativeai.run/output/sneaker-marble-bg.png"
# }
Python: background replacement with polling
Submit the image, describe the new background, and poll /api/generate/status/:id until the output URL is ready.
import os
import time

import httpx

API_KEY = os.environ["CREATIVEAI_API_KEY"]
BASE = "https://api.creativeai.run"


def replace_background(image_url: str, prompt: str,
                       negative_prompt: str | None = None) -> dict:
    """Swap the background of a product image and poll for the result."""
    payload = {
        "image_url": image_url,
        "prompt": prompt,
    }
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt

    with httpx.Client(
        base_url=BASE,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=60.0,
    ) as client:
        # 1. Submit
        resp = client.post("/api/generate/replace-background", json=payload)
        resp.raise_for_status()
        job = resp.json()
        gen_id = job["id"]
        print(f"Queued: {gen_id} credits_charged={job['credits']}")

        # 2. Poll
        for _ in range(36):  # up to ~3 minutes
            time.sleep(5)
            poll = client.get(f"/api/generate/status/{gen_id}")
            poll.raise_for_status()
            result = poll.json()
            if result["status"] == "completed":
                print(f"Done! Output: {result['output_url']}")
                return result
            if result["status"] == "failed":
                raise RuntimeError(f"Background replacement failed: {result}")
        raise TimeoutError("Background replacement timed out")


# --- Example: lifestyle scene for a Shopify PDP ---
result = replace_background(
    image_url="https://cdn.example.com/products/sneaker-white.png",
    prompt="Outdoor running track at sunrise, dramatic lighting, shallow depth of field",
    negative_prompt="indoor, studio, plain background",
)
print(f"New image: {result['output_url']}")
Full pipeline: background swap → video ad
Chain both endpoints for maximum impact: first replace the product background with a lifestyle scene, then feed the new image into the multimodal ad generator for a complete video ad with voiceover.
import os
import time

import httpx

API_KEY = os.environ["CREATIVEAI_API_KEY"]
BASE = "https://api.creativeai.run"


def full_ad_pipeline(product_image_url: str, ad_script: str,
                     bg_prompt: str, voice_id: str = "nova") -> dict:
    """
    End-to-end pipeline:
    1. Swap the product background for a lifestyle scene
    2. Generate a multimodal video ad from the new image
    """
    headers = {"Authorization": f"Bearer {API_KEY}"}
    with httpx.Client(base_url=BASE, headers=headers, timeout=60.0) as client:
        # ---- Step 1: Background replacement ----
        bg_resp = client.post("/api/generate/replace-background", json={
            "image_url": product_image_url,
            "prompt": bg_prompt,
        })
        bg_resp.raise_for_status()
        bg_id = bg_resp.json()["id"]
        print(f"[1/2] Background swap queued: {bg_id}")

        # Poll background job
        new_image_url = None
        for _ in range(36):
            time.sleep(5)
            poll = client.get(f"/api/generate/status/{bg_id}")
            poll.raise_for_status()
            data = poll.json()
            if data["status"] == "completed":
                new_image_url = data["output_url"]
                break
            if data["status"] == "failed":
                raise RuntimeError(f"Background swap failed: {data}")
        if not new_image_url:
            raise TimeoutError("Background swap timed out")
        print(f"[1/2] New background image: {new_image_url}")

        # ---- Step 2: Multimodal ad generation ----
        ad_resp = client.post("/v1/multimodal-ad/generations", json={
            "product_image_url": new_image_url,
            "script": ad_script,
            "voice_id": voice_id,
            "target_audience": "shoppers",
        })
        ad_resp.raise_for_status()
        ad_id = ad_resp.json()["id"]
        print(f"[2/2] Ad generation queued: {ad_id}")

        # Poll ad job
        for _ in range(60):
            time.sleep(5)
            poll = client.get(f"/v1/multimodal-ad/generations/{ad_id}")
            poll.raise_for_status()
            result = poll.json()
            if result["status"] == "completed":
                return {
                    "background_image": new_image_url,
                    "video_url": result["video_url"],
                    "audio_url": result["audio_url"],
                    "script": result["script"],
                }
            if result["status"] == "failed":
                raise RuntimeError(f"Ad generation failed: {result}")
        raise TimeoutError("Ad generation timed out")


# --- Run the full pipeline ---
output = full_ad_pipeline(
    product_image_url="https://cdn.example.com/products/sneaker-white.png",
    ad_script="Step into comfort. The all-new CloudStep sneaker: lightweight, breathable, built for your everyday hustle.",
    bg_prompt="Rooftop terrace overlooking a city skyline at golden hour, cinematic lighting",
)
print(f"Video: {output['video_url']}")
print(f"Audio: {output['audio_url']}")
print(f"Still: {output['background_image']}")
Request Parameters Reference
POST /v1/multimodal-ad/generations
product_image_url (required) - URL of the source product image
script (required) - The ad script to be spoken as voiceover
voice_id - Voice for TTS (default: "nova")
avatar_id - Optional avatar presenter for talking-head overlay
target_audience - Audience hint for styling (e.g. "young_professionals", "shoppers")
webhook_url - Callback URL for push-based delivery
POST /api/generate/replace-background
image_url - Direct URL of the image (provide this or generation_id)
generation_id - ID of an existing completed generation to use as source
prompt (required) - Description of the new background scene
negative_prompt - Things to exclude from the result
Integration Checklist
Set webhook_url in production to avoid polling overhead at scale.
Store the generation_id per product SKU for traceability and re-generation.
Ready to build product video ads at scale?
Grab an API key, paste the Python snippet above, and generate your first multimodal ad in under 2 minutes.
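For the webhook route recommended in the integration checklist, the receiving side might look roughly like the sketch below. It covers only the payload-handling step, not the HTTP server around it. The callback body format is not documented on this page, so the code assumes it is JSON shaped like the polling response (id, status, video_url, audio_url); `handle_webhook` and its "action" values are hypothetical names chosen for illustration.

```python
import json


def handle_webhook(body: bytes) -> dict:
    """Parse a webhook delivery and decide what the app should do next.

    ASSUMPTION: the callback body is JSON shaped like the polling
    response (id, status, video_url, audio_url). Verify this against
    real deliveries before relying on it.
    """
    try:
        event = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return {"action": "reject", "reason": "body is not valid JSON"}

    if not isinstance(event, dict) or "id" not in event or "status" not in event:
        return {"action": "reject", "reason": "missing id or status"}

    status = event["status"]
    if status == "completed":
        # Persist the asset URLs against the job id
        return {
            "action": "store",
            "id": event["id"],
            "video_url": event.get("video_url"),
            "audio_url": event.get("audio_url"),
        }
    if status == "failed":
        # Surface the failure so the job can be retried or inspected
        return {"action": "alert", "id": event["id"]}
    # Any other status (e.g. "pending") needs no action yet
    return {"action": "ignore", "id": event["id"]}


if __name__ == "__main__":
    sample = b'{"id": "ad_a1b2c3d4e5f6", "status": "completed", "video_url": "https://creativeai.run/output/ad-video.mp4"}'
    print(handle_webhook(sample))
```

Keeping the decision logic in a pure function makes it easy to unit-test and to call from whichever web framework serves your webhook route.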