

Mellum2
#5jetbrains · v2 · since 2026-06-02 · 5× · most recently June 29, 2026
Momentum: 17
Mellum2 is a 12-billion-parameter language model with a Mixture-of-Experts architecture, optimized by JetBrains specifically for software development tasks. Of those parameters, 2.5 billion are active for any given token, and the model was trained on approximately 11 trillion tokens. It was released as open source and is designed for coding, reasoning, tool use, and agentic workflows.
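To put the quoted sizes in proportion, the short Python sketch below computes the share of parameters that is active per token and the ratio of training tokens to total parameters. It uses only the three figures from the description; the constant and function names are made up for illustration and do not come from any JetBrains tooling.

```python
# Figures quoted in the description above (rounded, as published).
TOTAL_PARAMS = 12e9     # total parameters
ACTIVE_PARAMS = 2.5e9   # parameters active for a single token
TRAIN_TOKENS = 11e12    # approximate number of training tokens


def active_fraction(active: float, total: float) -> float:
    """Share of all parameters that take part in processing one token."""
    return active / total


def tokens_per_param(tokens: float, params: float) -> float:
    """Training tokens seen per parameter (total-parameter basis)."""
    return tokens / params


frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
ratio = tokens_per_param(TRAIN_TOKENS, TOTAL_PARAMS)

print(f"active share per token: {frac:.1%}")
print(f"training tokens per parameter: {ratio:.0f}")
```

Roughly one fifth of the weights are used per token, which is why a Mixture-of-Experts model of this size can have a per-token compute cost closer to that of a much smaller dense model.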
Momentum history (chart not reproduced)
Features
| Context Size | 131,072 tokens (128K); achieved through layer-selective YaRN scaling after pre-training. The architecture combines sliding-window attention (on 3 of 4 layers) with full-attention layers. |
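The mix of sliding-window and full-attention layers can be illustrated with a minimal Python sketch. Several details here are assumptions, not facts from the source: the layer count (8), the window size (3), and the choice that the last layer in each group of four is the full-attention one are all hypothetical. YaRN scaling is not modeled.

```python
from typing import List, Optional


def layer_kinds(num_layers: int, period: int = 4) -> List[str]:
    """Label each layer as 'sliding' or 'full'.

    Assumption: the last layer in every group of `period` layers uses
    full attention, giving 3 sliding-window layers out of every 4.
    """
    return ["full" if (i + 1) % period == 0 else "sliding"
            for i in range(num_layers)]


def causal_mask(seq_len: int, window: Optional[int] = None) -> List[List[bool]]:
    """Return mask[q][k], True when query position q may attend to key k.

    With window=None this is ordinary causal (full) attention. With a
    window, each query sees only itself and the window - 1 tokens before it.
    """
    return [[k <= q and (window is None or q - k < window)
             for k in range(seq_len)]
            for q in range(seq_len)]


kinds = layer_kinds(8)                 # hypothetical 8-layer stack
full = causal_mask(6)                  # full causal attention
local = causal_mask(6, window=3)       # hypothetical window of 3 tokens

print(kinds)
print("allowed pairs, full:", sum(map(sum, full)))
print("allowed pairs, sliding:", sum(map(sum, local)))
```

In a full-attention layer the number of allowed query-key pairs grows quadratically with sequence length, while in a sliding-window layer it grows linearly. That is the usual motivation for reserving full attention for a minority of layers in long-context models.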