API Performance & Latency
Sub-200ms TTFB, 1.2-second image generation, 45-second video generation. Real numbers from production, not cherry-picked benchmarks.
- API TTFB: <180ms (p50 global)
- Image Gen: 1.2s (Seedream, 1024×1024)
- Video Gen: ~45s (Kling, 5s clip)
- Uptime: 99.97% (rolling 90 days)
Per-Model Latency Benchmarks
Real production p50 numbers across every model we serve. Updated daily from aggregated telemetry.
- Seedream V1.5 (Image)
- Seedream V2.0 (Image)
- Kling V2.0 (Video)
- Kling V3.0 (Video)
- Vidu Q1 (Video)

Percentile Breakdown
Tail latency matters. Here's what p50 through p99 look like across our core metrics.
| Percentile | TTFB | Image Gen (1024px) | Video Gen (5s clip) |
|---|---|---|---|
| p50 | 142ms | 1.18s | 43s |
| p75 | 168ms | 1.34s | 48s |
| p90 | 201ms | 1.62s | 56s |
| p99 | 312ms | 2.41s | 72s |
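The percentiles above are just ranked positions in the latency distribution: sort your samples and read off the value at the 50th, 90th, or 99th position out of 100. A minimal sketch of that nearest-rank computation, using simulated TTFB samples (the distribution parameters are illustrative, not our telemetry):

```python
import random

def percentile(samples, p):
    """Nearest-rank percentile: sort, then index at ceil(p% of n)."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

# Simulated TTFB samples in ms; real values come from request telemetry.
random.seed(7)
ttfb = [random.gauss(150, 30) for _ in range(10_000)]

for p in (50, 75, 90, 99):
    print(f"p{p}: {percentile(ttfb, p):.0f}ms")
```

Note how p99 sits well above p50 even for a well-behaved distribution; that gap is why we publish the full breakdown rather than a single median.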
How We Compare
Side-by-side with other AI generation API providers on the metrics that matter most.
| Provider | TTFB | Image Gen | Video Gen | Uptime |
|---|---|---|---|---|
| CreativeAI | 142–195ms | 1.2s | 38–45s | 99.97% |
| Provider A | 350–600ms | 3.5s | 90–120s | 99.5% |
| Provider B | 500–900ms | 4.2s | 120–180s | 99.2% |
Built for Speed at Scale
The infrastructure behind the numbers, from edge routing to GPU auto-scaling.
Global Edge Routing
Requests hit the nearest PoP before reaching GPU clusters. Median network hop < 40ms worldwide.
Dedicated GPU Pools
No cold starts. Warm model instances on A100 / H100 clusters with auto-scaling for burst traffic.
99.97% Uptime SLA
Redundant inference backends with automatic failover. Status page transparency with 90-day rolling metrics.
Real-Time Monitoring
Per-request latency traces exposed via response headers. Integrate with your own Datadog / Grafana dashboards.
Auto-Scaling
Burst from 10 to 10,000 concurrent requests without pre-warming. Queue-based scheduling with priority tiers.
Async + Webhooks
Fire-and-forget generation with webhook callbacks. No polling overhead: you're notified the instant results are ready.
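On the receiving end, a webhook handler should authenticate the callback before trusting it. A minimal sketch using HMAC-SHA256 over the raw request body — note that the signature header name, the secret format, and the payload shape here are assumptions for illustration, not the documented webhook contract:

```python
import hashlib
import hmac
import json

# Hypothetical signing secret you would configure alongside the webhook URL.
WEBHOOK_SECRET = b"whsec_example"

def verify_signature(body: bytes, signature_header: str) -> bool:
    """Recompute HMAC-SHA256 of the raw body and compare in constant time."""
    expected = hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

def handle_webhook(body: bytes, signature_header: str) -> dict:
    """Reject unauthenticated callbacks, then decode the event payload."""
    if not verify_signature(body, signature_header):
        raise ValueError("invalid webhook signature")
    # Payload shape is illustrative, e.g. {"status": "succeeded", ...}
    return json.loads(body)
```

Verifying against the raw bytes (not a re-serialized JSON object) matters: any whitespace or key-order change would break the HMAC.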
Per-Request Observability
Every response ships latency headers you can pipe straight into your monitoring stack.
# Response headers on every API call
X-Request-Duration: 1247ms
X-Queue-Wait: 12ms
X-Inference-Time: 1183ms
X-Model: seedream-v1.5
X-Region: us-east-1
# Pipe into Datadog, Grafana, or your own dashboards
curl -s -o /dev/null -w "TTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  https://api.creativeai.run/v1/generate
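To forward these timings to a metrics backend, you first need to pull the numeric values out of the header strings. A small sketch (the `headers` dict mirrors the example above; wire the result into your Datadog or Grafana client of choice):

```python
def parse_latency_headers(headers: dict) -> dict:
    """Strip the 'ms' suffix from the timing headers and return integers."""
    timing = {}
    for name in ("X-Request-Duration", "X-Queue-Wait", "X-Inference-Time"):
        value = headers.get(name)
        if value and value.endswith("ms"):
            timing[name] = int(value[:-2])
    return timing

headers = {
    "X-Request-Duration": "1247ms",
    "X-Queue-Wait": "12ms",
    "X-Inference-Time": "1183ms",
    "X-Model": "seedream-v1.5",
    "X-Region": "us-east-1",
}
print(parse_latency_headers(headers))
# -> {'X-Request-Duration': 1247, 'X-Queue-Wait': 12, 'X-Inference-Time': 1183}
```

A useful derived gauge is `X-Request-Duration` minus `X-Inference-Time` minus `X-Queue-Wait`, which approximates per-request overhead outside the model itself.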