

Nemotron 3 Nano Omni
#18 in Multimodal Models · nvidia · v3 · nano omni · since 2026-04-28 · 26× · last: June 30, 2026
NVIDIA Nemotron 3 Nano Omni is an open multimodal large language model with 30 billion total parameters, of which only 3 billion are active per token (mixture-of-experts, MoE, design). It is built on a hybrid Mamba-Transformer MoE architecture and natively processes text, image, video, and audio input in a single inference loop, producing text output. The model is designed to serve as a multimodal perception sub-agent within larger agentic systems. Its official context window is 256,000 tokens (the OpenRouter listing gives up to 300,000). According to NVIDIA, it achieves top results on multiple multimodal leaderboards, including MMLongBench-Doc, OCRBenchV2, WorldSense, DailyOmni, VoiceBench, and MediaPerf.
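The total and active parameter counts quoted above can be turned into an active share with simple arithmetic. The following is a minimal sketch of that calculation only; the constant and function names are hypothetical and chosen for illustration, and the sketch says nothing about how the model actually routes inputs to experts.

```python
# Figures taken from the description above; names are illustrative only.
TOTAL_PARAMS = 30_000_000_000   # total parameters
ACTIVE_PARAMS = 3_000_000_000   # active parameters (MoE design)


def active_fraction(total: int, active: int) -> float:
    """Return the share of parameters that are active, as a value in [0, 1]."""
    if total <= 0 or not 0 <= active <= total:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active / total


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"{share:.0%} of parameters active")
```

Under these figures the active share is one tenth of the total, which is the sense in which the MoE design keeps compute low relative to model size.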
Features
| Feature | Details |
| --- | --- |
| Context Window (Tokens) | 256,000 (official, per NVIDIA NIM and the Hugging Face model card); up to 300,000 per the OpenRouter listing |
| License | NVIDIA Open Model License (Nemotron Open Model License) – commercially usable, with model-specific terms (not Apache 2.0) |
| Release Date | April 28, 2026 |
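
Because the listed sources disagree on the context window, a caller may want to check a request against the more conservative figure first. The sketch below is one way to do that, assuming (this is not stated in the source) that prompt tokens and reserved output tokens share the same window; the function and constant names are hypothetical, and token counts would have to come from whatever tokenizer the serving endpoint uses.

```python
# Limits as listed in the table above; names are illustrative only.
OFFICIAL_CONTEXT = 256_000     # per NVIDIA NIM / Hugging Face model card
OPENROUTER_CONTEXT = 300_000   # per OpenRouter listing


def fits_context(prompt_tokens: int,
                 reserved_output_tokens: int,
                 limit: int = OFFICIAL_CONTEXT) -> bool:
    """Check whether prompt plus reserved output stays within the limit.

    Assumption: input and output tokens count against one shared window.
    """
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_output_tokens <= limit


if __name__ == "__main__":
    # A 250,000-token prompt with 8,000 tokens reserved for the answer
    # needs 258,000 tokens in total.
    print(fits_context(250_000, 8_000))                      # official limit
    print(fits_context(250_000, 8_000, OPENROUTER_CONTEXT))  # OpenRouter limit
```

Defaulting to the official 256,000-token figure is a deliberate choice here: a request that passes the stricter check stays within both listed limits.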