Synthszr Charts: the big AI brands competing for the podium

Nemotron 3.5 ASR

#9 in Transcription (STT)

nvidia · v3.5 · asr · since 2026-06-04 · 2× · last seen Jun 29, 2026

Momentum

Nemotron 3.5 ASR (nvidia/nemotron-3.5-asr-streaming-0.6b) is a multilingual, streaming-capable automatic speech recognition model from NVIDIA with 600 million parameters. It is built on a Cache-Aware FastConformer-RNNT architecture that eliminates overlapping recomputation, enabling very low end-to-end latency at high GPU concurrency. A single checkpoint covers 40 language-locale combinations (approximately 36 languages), with native punctuation and capitalization support. The model is released as open weights under the OpenMDW-1.1 license, available on Hugging Face and NVIDIA NGC.
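To make "eliminates overlapping recomputation" concrete, here is a toy counting sketch. It is not NVIDIA's implementation: the function names, the frame counts, and the left-context size are hypothetical and chosen only for illustration. It compares how many encoder frames are processed when each chunk is re-encoded together with its left context (buffered streaming) against the case where past activations are cached and every frame is encoded once (cache-aware streaming).

```python
def buffered_frames_processed(total_frames: int, chunk: int, left_context: int) -> int:
    """Frames encoded when every chunk is re-encoded together with its left context.

    Hypothetical model of buffered streaming: the context frames were already
    encoded for earlier chunks, so encoding them again is redundant work.
    """
    processed = 0
    for start in range(0, total_frames, chunk):
        end = min(start + chunk, total_frames)
        context_start = max(0, start - left_context)
        processed += end - context_start
    return processed


def cache_aware_frames_processed(total_frames: int, chunk: int) -> int:
    """Frames encoded when past activations are cached: each frame is encoded once."""
    processed = 0
    for start in range(0, total_frames, chunk):
        end = min(start + chunk, total_frames)
        processed += end - start
    return processed


if __name__ == "__main__":
    # Assumed values: 1,000 frames, 8-frame chunks, 56 frames of left context.
    total, chunk, context = 1000, 8, 56
    buffered = buffered_frames_processed(total, chunk, context)
    cached = cache_aware_frames_processed(total, chunk)
    print(f"buffered: {buffered} frames, cache-aware: {cached} frames")
    print(f"redundant work factor: {buffered / cached:.2f}x")
```

In this toy model the redundant work grows with the ratio of left context to chunk size, which is consistent with the page's claim that the smallest chunk setting benefits most from caching.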

[Chart: Momentum trend]

Features

Latency (ms)	Sub-100 ms end-of-utterance latency; chunk size is configurable at runtime (80 ms, 160 ms, 320 ms, 560 ms, or 1,120 ms) with no retraining required
Model Size (Parameter Count)	600 million parameters (0.6B)
Processing Speed (x Realtime)	Up to ~17x more concurrent streams than buffered streaming (Parakeet RNNT 1.1B) on a single NVIDIA H100: 240 vs. 14 parallel streams at the 80 ms setting (~17x), and 2,400 vs. 400 at the 1,120 ms setting (6x)
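The concurrency figures in the table can be checked with a few lines of arithmetic. The sketch below uses only the stream counts quoted above; the dictionary layout and the function name are illustrative assumptions, not part of any NVIDIA tooling.

```python
# Concurrent-stream counts quoted in the Features table (single NVIDIA H100).
# Keys are chunk sizes in milliseconds; the labels are illustrative only.
STREAMS = {
    80: {"nemotron": 240, "parakeet_buffered": 14},
    1120: {"nemotron": 2400, "parakeet_buffered": 400},
}


def concurrency_gain(chunk_ms: int) -> float:
    """Ratio of Nemotron 3.5 ASR streams to buffered Parakeet RNNT 1.1B streams."""
    row = STREAMS[chunk_ms]
    return row["nemotron"] / row["parakeet_buffered"]


if __name__ == "__main__":
    for chunk_ms in STREAMS:
        print(f"{chunk_ms} ms: {concurrency_gain(chunk_ms):.1f}x")
```

The headline "~17x" therefore describes the 80 ms setting; at 1,120 ms the same comparison gives a 6x gain.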

