

Qwen-Image-2.0
#31 in Multimodal Models · alibaba · v2.0 · since February 2026 · 6× · most recently 30 June 2026
12
Momentum
Qwen-Image-2.0 is a 7-billion-parameter image generation and editing model released by Alibaba in February 2026. It unifies text-to-image generation and image editing in a single architecture, accepts prompts of up to 1,000 tokens, renders professional-grade typography, and generates natively at 2,048×2,048 pixel resolution. The model ranks #1 on the AI Arena leaderboard in both categories (text-to-image generation and image editing).
Momentum trend
[Chart: momentum over time, x-axis from 04.04. to 03.07.]
Properties
| Property | Value |
| --- | --- |
| Context Window (Tokens) | Up to 1,000 tokens prompt input (for text-to-image generation and image editing) |
| Multimodal Inputs | Text prompts (up to 1,000 tokens) + image inputs (for image editing); output: images up to 2048×2048 px natively. No video input documented – image/text inputs only. |
| On-Device vs. Cloud | Cloud (API access via Alibaba Cloud BaiLian / DashScope; open weights not yet released at the time of research; access is API-only through an invitation test) |
| Price per Unit | $0.035 per generated image (via Qwen Cloud / DashScope, international endpoint; rate limit: 120 RPM) |
| Vision-Language Benchmark Score | DPG-Bench: 88.32 (vs. FLUX.1 12B: 83.84); GenEval: ~0.91; #1 on AI Arena ELO leaderboard (blind human evaluation) in the categories text-to-image generation and image editing |
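
The price and rate limit listed in the table can be combined into a rough budget estimate. The following is a minimal sketch, not an official calculator: it assumes one generated image per request, that the 120 RPM limit applies per request, and that there are no retries, failures, or volume discounts. The function name `estimate_batch` is hypothetical.

```python
# Figures taken from the properties table above.
PRICE_PER_IMAGE_USD = 0.035  # listed price per generated image
RATE_LIMIT_RPM = 120         # listed rate limit, requests per minute


def estimate_batch(num_images: int) -> tuple[float, float]:
    """Return (total cost in USD, minimum wall-clock minutes) for a batch.

    Assumption: each request yields exactly one image and every request
    succeeds, so the rate limit alone bounds the throughput.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    cost = round(num_images * PRICE_PER_IMAGE_USD, 2)  # round to cents
    minutes = num_images / RATE_LIMIT_RPM
    return cost, minutes


if __name__ == "__main__":
    cost, minutes = estimate_batch(1200)
    print(f"1200 images: ${cost:.2f}, at least {minutes:.1f} min")
```

Under these assumptions, a batch of 1,200 images would cost about $42 and take at least 10 minutes to submit at the listed rate limit.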