Can't generate video in real-time? Pre-generate during idle time and cache for instant playback. Build zero-latency agent experiences with CreativeAPI.
Real-time video generation isn't possible — yet. Kling, Veo, and other models need 5-8 seconds to generate a 5-second video. Here's how to work around it.
- Pre-defined prompts for common agent scenarios
- Generate up to 20 videos per API call
- Receive completion notifications instantly via webhooks
- Store videos for instant retrieval during calls
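As a sketch of the first point, pre-defined prompts can live in a simple scenario-keyed table. The scenario names below are illustrative, not CreativeAPI constants:

```python
# Illustrative scenario -> prompt table for pre-generation.
# Scenario keys and prompt text are examples, not API constants.
SCENARIO_PROMPTS: dict[str, str] = {
    "greeting": "Professional greeting in office setting, friendly smile",
    "feature_demo": "Explaining product features with hand gestures",
    "acknowledge": "Nodding in agreement during conversation",
    "thinking": "Thinking pause before responding",
    "closing": "Closing summary with wave goodbye",
}


def prompts_for_scenarios(scenarios: list[str]) -> list[str]:
    """Collect the prompts an upcoming call may need, skipping unknown keys."""
    return [SCENARIO_PROMPTS[s] for s in scenarios if s in SCENARIO_PROMPTS]
```

Keeping prompts keyed by scenario makes the later cache lookup ("which video matches this moment in the call?") a plain dictionary access instead of fuzzy matching.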
A Python implementation of the pre-generation workflow, covering the batch API and webhook handling.
```python
# Pre-generation workflow for video agents
import asyncio
from datetime import datetime, timedelta

import httpx


class AgentVideoPreGenerator:
    def __init__(self, api_key: str, cache_url: str):
        self.api_key = api_key
        self.cache_url = cache_url
        self.base_url = "https://api.creativeai.run/v1"

    async def pregenerate_batch(self, prompts: list[str]) -> list[str]:
        """Pre-generate videos for upcoming agent calls."""
        # Submit a batch of up to 20 videos in one request
        batch_request = {
            "videos": [
                {
                    "prompt": prompt,
                    "model": "kling-v3",
                    "duration": 5,
                    "aspect_ratio": "16:9",
                }
                for prompt in prompts
            ],
            "webhook_url": f"{self.cache_url}/webhook/complete",
            "failover": True,  # Auto-switch to a backup model
        }
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}/video/batch",
                json=batch_request,
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json",
                },
            )
        batch = response.json()
        return batch["job_ids"]

    async def handle_webhook(self, payload: dict):
        """Process a webhook completion and store the result in the cache."""
        now = datetime.utcnow()
        video_data = {
            "job_id": payload["id"],
            "video_url": payload["output"]["video_url"],
            "thumbnail_url": payload["output"]["thumbnail_url"],
            "prompt": payload["prompt"],
            "model": payload["model"],
            "cached_at": now.isoformat(),
            "expires_at": (now + timedelta(hours=24)).isoformat(),
        }
        # Store in your cache layer (Redis, S3, etc.)
        await self.cache_video(video_data)
        return {"status": "cached"}

    async def get_video_for_call(self, scenario: str) -> str:
        """Retrieve a pre-generated video for a live call."""
        # `self.cache`, `cache_video`, and `generate_fallback` are hooks
        # into your own cache layer; implement them for your stack.
        cached = await self.cache.get(matching=scenario)
        if cached:
            return cached["video_url"]
        # Fallback: generate synchronously (not ideal)
        return await self.generate_fallback(scenario)


# Usage: pre-generate during idle time
async def main() -> list[str]:
    generator = AgentVideoPreGenerator(
        api_key="rph_live_...",
        cache_url="https://your-cache.example.com",
    )
    # Pre-generate common scenarios before calls
    prompts = [
        "Professional greeting in office setting, friendly smile",
        "Explaining product features with hand gestures",
        "Nodding in agreement during conversation",
        "Thinking pause before responding",
        "Closing summary with wave goodbye",
    ]
    return await generator.pregenerate_batch(prompts)


# Run during low-traffic hours
job_ids = asyncio.run(main())
```

| Approach | Latency | Reliability | User Experience | Scaling |
|---|---|---|---|---|
| Real-time Generation | 5-8 seconds | Model dependent | Awkward pause | Limited |
| Pre-generation (Recommended) | <100ms | 99.9%+ | Seamless | Unlimited |
| Company Type | Example | Pre-generated Prompts | Volume |
|---|---|---|---|
| AI Sales Demo Agents | Supersonik-style product demos | Greeting, feature highlight, pricing, close | 50-200 videos/session |
| Customer Support Agents | Keyframe Labs video calls | Empathy responses, solutions, follow-ups | 20-50 videos/shift |
| Education Tutors | Subject (edtech) courses | Concept explanations, examples, encouragement | 100+ videos/lesson |
| Financial Advisors | Storyline partnership | Portfolio updates, recommendations, summaries | 10-30 videos/client |
Kling, Veo, Seedance, Vidu — automatic failover if one model is slow.
Webhooks retry up to 3 times with exponential backoff, and payloads are HMAC-signed for security.
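Webhook payloads should be verified before anything is cached. A sketch of the verification side, assuming the signature arrives as a hex-encoded HMAC-SHA256 of the raw request body in a header (the exact header name and encoding are assumptions; check the CreativeAPI webhook docs for the documented scheme):

```python
import hashlib
import hmac


def verify_webhook(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True if the webhook signature matches the request body.

    Assumes a hex-encoded HMAC-SHA256 over the raw body; adjust to the
    scheme your provider actually documents.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing attacks
    return hmac.compare_digest(expected, signature_header)
```

Verify against the raw bytes before JSON parsing, and reject the request (e.g. HTTP 401) on mismatch so a forged completion never lands in the cache.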
Generate up to 20 videos per API call. Parallel processing.
Batch generation reduces per-video cost. Pre-generate during off-peak hours for maximum savings.
Get your CreativeAPI key, implement the pre-generation workflow, and deliver seamless video agent experiences.
Related Resources