Synthszr Charts: the big AI brands competing for the podium

Qwen2.5-32B-Instruct

#51 in Open-Source Language Models

alibaba · v2.5 · 32b instruct · since 2024-09-19 · 3× · last June 30, 2026

Momentum

Qwen2.5-32B-Instruct is an instruction-tuned open-weight language model from the Qwen team at Alibaba Cloud with 32.5 billion parameters, released in September 2024 under the Apache 2.0 license. The model supports context windows of up to 131,072 tokens (128K) and can generate up to 8,192 tokens. Pretrained on 18 trillion tokens, it shows significant improvements over Qwen2 in instruction following, coding, mathematics, and structured outputs (JSON). The weights can be downloaded freely, and the model is also accessible via various API providers.
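As a rough illustration of how the two limits quoted above interact, the following sketch checks whether a request fits. It assumes that prompt and generated tokens share the same 128K window, which is typical for decoder-only models but is not stated on this page; the function and constant names are hypothetical.

```python
# Limits as quoted in the description and the Features table.
MAX_CONTEXT_TOKENS = 131_072  # 128K context window
MAX_OUTPUT_TOKENS = 8_192     # maximum generation length


def fits_in_context(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if the prompt plus the requested output stay within both limits.

    Assumption: prompt and output tokens count against the same context window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= MAX_CONTEXT_TOKENS


if __name__ == "__main__":
    print(fits_in_context(100_000, 8_192))  # long prompt, full-length answer
    print(fits_in_context(130_000, 8_192))  # prompt leaves too little room
```

Under this assumption, a prompt of up to 122,880 tokens still leaves room for a full 8,192-token answer.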

Momentum Trend

[Chart: momentum over time, 04.04. to 03.07.]

Features

Context Window	131,072 tokens (128K) maximum input length; maximum output 8,192 tokens. The default config.json is set to 32,768 tokens; long context can be enabled via rope_scaling.
Model Size (Parameters)	32.5 billion parameters (32.5B); 64 transformer layers; architecture: dense decoder-only with RoPE, SwiGLU, RMSNorm, attention QKV bias
Price Tier	Open-weight / free to self-host (Apache 2.0). Via OpenRouter API: approx. $0.79/M input tokens and $0.40/M output tokens (as of March 2025). Pay-as-you-go, no subscription required.
Memory Requirement	approx. 65 GB VRAM at BF16/FP16 (unquantized 16-bit, inference incl. weights + KV cache); approx. 33 GB at INT8; approx. 18 GB at INT4 quantization. Recommended GPU for FP16: A100 80GB.
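The figures in the table can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is not an official calculator. It estimates memory for the weights alone (parameters times bytes per parameter), derives the long-context length from the default 32,768 tokens using a scaling factor of 4.0 (an assumption based on the YaRN rope_scaling settings commonly cited for Qwen2.5), and computes API cost from the OpenRouter prices listed above. All function names are hypothetical.

```python
PARAMS = 32.5e9  # parameter count from the table

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(precision: str, params: float = PARAMS) -> float:
    """Memory for the weights only, in GB (10^9 bytes); KV cache and overhead are extra."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def extended_context(original_max: int = 32_768, factor: float = 4.0) -> int:
    """Context length after RoPE scaling; the factor of 4.0 is an assumption."""
    return int(original_max * factor)


def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 0.79,
                 output_price_per_m: float = 0.40) -> float:
    """Pay-as-you-go cost in USD at the per-million-token prices quoted in the table."""
    return (input_tokens / 1e6 * input_price_per_m
            + output_tokens / 1e6 * output_price_per_m)


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(precision, weight_memory_gb(precision), "GB")
    print("extended context:", extended_context(), "tokens")
    print("cost for 1M in + 1M out:", api_cost_usd(1_000_000, 1_000_000), "USD")
```

The weights-only estimates come to 65 GB, 32.5 GB, and 16.25 GB. The table's INT8 and INT4 figures are slightly higher, presumably to leave room for runtime overhead.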
