Synthszr Charts: the big AI brands competing for the podium

Qwen2.5-32B-Instruct

#51 in Open-Source LLMs

alibaba · v2.5 · 32b instruct · since 2024-09-19 · 3× · last seen Jun 30, 2026

Momentum

Qwen2.5-32B-Instruct is an instruction-tuned open-weight language model from Alibaba Cloud's Qwen team with 32.5 billion parameters, released in September 2024 under the Apache 2.0 license. It supports a context window of up to 128K (131,072) tokens and can generate up to 8,192 tokens per response. Pretrained on 18 trillion tokens, it shows significant improvements over Qwen2 in instruction following, coding, mathematics, and structured output such as JSON. The weights are freely downloadable, and the model is also accessible through various API providers.
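The two limits above interact when sizing a request. The following is a minimal sketch, assuming that prompt and generated tokens share the same 131,072-token window (typical for decoder-only models, but not stated on this page); the function name `max_prompt_tokens` is hypothetical and chosen for illustration only.

```python
# Limits as listed for Qwen2.5-32B-Instruct on this page.
MAX_CONTEXT_TOKENS = 131_072  # 128K context window
MAX_OUTPUT_TOKENS = 8_192     # maximum generation length


def max_prompt_tokens(requested_output: int) -> int:
    """Return how many prompt tokens fit next to the requested output.

    Assumption: prompt and output tokens count against one shared window.
    """
    if not 0 < requested_output <= MAX_OUTPUT_TOKENS:
        raise ValueError(
            f"requested_output must be between 1 and {MAX_OUTPUT_TOKENS}"
        )
    return MAX_CONTEXT_TOKENS - requested_output


if __name__ == "__main__":
    # Reserve the full generation budget and see what is left for the prompt.
    print(max_prompt_tokens(MAX_OUTPUT_TOKENS))
```

Under this assumption, reserving the full 8,192 output tokens leaves 122,880 tokens for the prompt.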

Momentum trend

Features

Context Window	131,072 tokens (128K) maximum input length; maximum output 8,192 tokens. The default config.json is set to 32,768 tokens; longer context can be enabled via rope_scaling (YaRN).
Model Size (Parameters)	32.5 billion parameters (32.5B); 64 transformer layers; architecture: dense decoder-only with RoPE, SwiGLU, RMSNorm, attention QKV bias
Price Tier	Open-weight / free to self-host (Apache 2.0). Via OpenRouter API: approx. $0.79/M input tokens and $0.40/M output tokens (as of March 2025). Pay-as-you-go, no subscription required.
Memory Requirement	approx. 65 GB VRAM at BF16/FP16 (unquantized 16-bit weights; the KV cache adds to this); approx. 33 GB at INT8; approx. 18 GB at INT4 quantization. Recommended GPU for FP16: A100 80GB.
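
The memory figures in the table follow roughly from parameter count times bytes per parameter. The sketch below reproduces that arithmetic for the weights only; it does not model the KV cache, activations, or quantization metadata, and it is an assumption of this sketch that such overhead explains why the listed INT8 and INT4 figures sit slightly above the raw numbers. The helper name `weight_memory_gb` is hypothetical.

```python
PARAMS = 32.5e9  # parameter count listed for Qwen2.5-32B-Instruct

# Storage cost per parameter for each numeric format.
BYTES_PER_PARAM = {
    "bf16/fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Estimate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_memory_gb(PARAMS, size):.2f} GB")
```

This gives about 65 GB at 16-bit, 32.5 GB at INT8, and 16.25 GB at INT4 for the weights, so long-context workloads need additional headroom for the KV cache on top of these values.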
