
Synthszr Charts: the big AI brands competing for the podium

Cartesia

#13

cartesia · since: first Sonic TTS model released in May 2024; current model Sonic-3.5 released on June 16, 2026 · 11× · most recently June 30, 2026

Momentum

Cartesia is an AI text-to-speech product from the eponymous startup, built on proprietary state-space models (SSMs) rather than classic transformer architectures. Its current flagship model, Sonic-3.5, is delivered via a streaming API with very low latency (sub-90ms time-to-first-audio), supports 40+ languages, expressive delivery (including laughter), and instant voice cloning. The product is offered as an API/SDK, web playground, and as the foundation for its own voice-agent platform ("Line"), with tiered pricing from free to enterprise.

Momentum history

[Chart omitted; axis labels: 04.04., 03.07.]

Features

Real-Time Streaming	Yes, streaming-first TTS API for real-time voice generation in voice agents
Latency	Sub-90ms time-to-first-audio (Sonic-3.5); some reports cite ~82ms or 100ms p90
License	Commercial SaaS usage via paid plans; separate open-source project 'Edge' (Apache 2.0) for on-device SSMs
Platform	Cloud API, web playground, on-premises and on-device deployment (SDKs for developers)
Price	Free $0/mo (20K credits); Pro $5/mo; Startup $49/mo; Scale $299/mo; Enterprise on request
Release Date	Sonic (first version) May 2024; Sonic-3.5 released June 16, 2026
Languages	42 languages natively supported (incl. English, Hindi, Spanish, French, German, Japanese, Hebrew)
Voice Cloning	Instant voice cloning possible with just 3–10 seconds of audio; 'Pro Voice Cloning' also available
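
The latency row above quotes a time-to-first-audio figure and a p90 value. As a rough illustration of how such numbers could be measured, the sketch below times the arrival of the first non-empty chunk from a generic byte stream and computes a nearest-rank percentile over several runs. It does not call Cartesia's SDK or API: `fake_tts_stream`, `time_to_first_audio`, and `percentile` are hypothetical helpers, and the delays in the demo are made up.

```python
import math
import time
from typing import Callable, Iterable, Iterator, List


def time_to_first_audio(
    stream: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return the seconds elapsed until the first non-empty chunk arrives."""
    start = clock()
    for chunk in stream:
        if chunk:  # skip empty keep-alive chunks
            return clock() - start
    raise ValueError("stream ended without delivering audio")


def percentile(samples: List[float], p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def fake_tts_stream(first_chunk_delay_s: float, chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response (not a real client)."""
    time.sleep(first_chunk_delay_s)  # simulated network plus model latency
    for _ in range(chunks):
        yield b"\x00" * 320  # placeholder PCM bytes


if __name__ == "__main__":
    # Made-up delays, chosen only to exercise the helpers.
    delays = [0.02, 0.03, 0.025, 0.04, 0.035]
    measured = [time_to_first_audio(fake_tts_stream(d)) for d in delays]
    print(f"p90 time-to-first-audio: {percentile(measured, 90) * 1000:.1f} ms")
```

Timing starts before the first chunk is requested, so the measurement covers everything up to the first audible bytes. A real benchmark would swap `fake_tts_stream` for an actual streaming response and use many more samples before quoting a p90.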

