

Gemma 4
#1 · google · v4 · since 2026-04-02 · 319× · most recently June 30, 2026
Momentum: 100
Gemma 4 is an open-weight model family released by Google DeepMind on April 2, 2026, under the Apache 2.0 license. It spans five sizes (E2B, E4B, 12B, 26B A4B, 31B) across two architectures: dense (31B) and Mixture-of-Experts (26B A4B). It is the first Gemma generation with native multimodal input: every size accepts text and images, and the E2B, E4B, and 12B variants also accept audio and video. The models target both on-device deployment (smartphones, edge devices) and consumer GPUs and workstations, and they include configurable thinking/reasoning modes and native function calling.
Momentum history
[Chart: momentum over time, x-axis from 04.04. to 03.07.]
Features
| Feature | Details |
| --- | --- |
| Key Benchmarks | AIME 2026: 89.2% (31B); GPQA Diamond: 84.3% (31B); LMArena score: 1452 (31B) / 1441 (26B MoE) |
| Context Window (Tokens) | 128K (E2B, E4B); 256K (12B, 26B A4B, 31B) |
| License | Apache 2.0 – commercial use without restrictions (no MAU caps) |
| Multimodality | Input: text and image (all variants), plus native video and audio (E2B, E4B, 12B); Output: text only |
| Platform | Hugging Face, Kaggle, Ollama (weights); Google AI Studio, Vertex AI (API); Local: llama.cpp, vLLM, MLX, LM Studio, Ollama; On-device: Android AICore, LiteRT-LM |
| Price | Weights free (open-weight); API via Google AI Studio / third-party providers (see the Price per 1M Tokens row) |
| Price per 1M Tokens | 31B: $0.12 input / $0.35 output; 26B A4B: $0.06 input / $0.30–0.33 output; E4B: $0.20 input / $0.20 output; E2B: free (via Google) |
| Release Date | April 2, 2026 (E2B/E4B/26B/31B); 12B Unified followed later (May/June 2026) |
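
The per-token prices in the table translate into a request cost with simple arithmetic: tokens times price, divided by one million. The following is a minimal sketch, not an official pricing calculator. The dictionary `PRICES_PER_MILLION` and the function `estimate_cost_usd` are hypothetical names introduced here, the figures are copied from the Price per 1M Tokens row, and the upper bound of the 26B A4B output range ($0.33) is an assumption made to get a conservative estimate. Actual provider pricing may differ.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the table above.
# Assumption: for 26B A4B the table lists an output range of $0.30-0.33;
# the upper bound is used here. E2B is listed as free and is omitted.
PRICES_PER_MILLION = {
    "31B": (Decimal("0.12"), Decimal("0.35")),
    "26B A4B": (Decimal("0.06"), Decimal("0.33")),
    "E4B": (Decimal("0.20"), Decimal("0.20")),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated API cost in USD for one request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 200K input tokens and 50K output tokens on the 31B model.
    print(estimate_cost_usd("31B", 200_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when many requests are summed.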