Synthszr Charts: the big AI brands competing for the podium

MAI-Transcribe-1

#16 in Transcription (STT)

Microsoft · v1 · since April 2, 2026 · 16× · most recently June 30, 2026

Momentum: 8

MAI-Transcribe-1 is Microsoft's first in-house automatic speech recognition (ASR) model, built by the MAI Superintelligence team, converting speech into text across 25 languages. Microsoft states it achieves the lowest Word Error Rate (WER, ~3.9%) on the FLEURS benchmark, outperforming Whisper-large-V3, GPT-Transcribe, ElevenLabs Scribe v2, and Gemini 3.1 Flash-Lite. It runs about 2.5x faster than Azure Fast Transcription at roughly 50% lower GPU cost, starting at $0.36 per audio hour. The model is available in public preview via Microsoft Foundry and Azure Speech, but does not yet support real-time transcription, speaker diarization, or keyword/context biasing (Microsoft states these are planned for a future update).
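For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The following is a minimal sketch of that standard definition, not Microsoft's or the FLEURS evaluation code. The function name `word_error_rate` is hypothetical, and the only normalization assumed here is lowercasing and whitespace splitting, whereas real benchmark scoring typically applies more elaborate text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    # Assumption: lowercase + whitespace split is the only normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a WER of about 3.9% corresponds to roughly 4 word errors per 100 reference words.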

Momentum trend (chart; date axis from April 4 to July 3)

Features

Real-Time Streaming: Not supported (batch model); real-time transcription reportedly in development by Microsoft
Latency: Batch transcription 2.5x faster than Azure Fast Transcription; ~69x real-time according to Artificial Analysis
Platform: Microsoft Foundry / Azure Speech (LLM Speech API); integrated into Copilot, Teams, Bing, PowerPoint
Price: From $0.36 per audio hour
Release Date: April 2, 2026 (Public Preview)
Languages: 25 languages (incl. English, German, French, Spanish, Hindi, Japanese, Korean, Chinese, Arabic)
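
As a rough illustration of what the listed price and speed figures imply for a batch job, the sketch below multiplies them out. It assumes the "from" price of $0.36 per audio hour applies flat to the whole job and that the ~69x real-time figure holds end to end. Neither assumption is confirmed by the listing, and the function name `estimate_batch_job` is hypothetical.

```python
PRICE_PER_AUDIO_HOUR_USD = 0.36  # listed "from" price; assumed flat here
REALTIME_FACTOR = 69.0           # ~69x real-time per the listing; assumed constant


def estimate_batch_job(audio_hours: float) -> tuple[float, float]:
    """Return (estimated cost in USD, estimated processing time in minutes)."""
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    cost_usd = audio_hours * PRICE_PER_AUDIO_HOUR_USD
    processing_minutes = audio_hours * 60.0 / REALTIME_FACTOR
    return cost_usd, processing_minutes


if __name__ == "__main__":
    cost, minutes = estimate_batch_job(100)
    print(f"100 audio hours: about ${cost:.2f}, about {minutes:.0f} min of processing")
```

Queueing, upload time, and any tiered pricing are ignored, so the result is an order-of-magnitude estimate only.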

Sources (16)
