

Flux.2 Klein
#21 in Text to Image · black-forest-labs · since 2026-01-15 · 5× · most recently 2026-06-30
Momentum: 19
FLUX.2 [klein] 4B is a distilled text-to-image and image-editing model from Black Forest Labs with 4 billion parameters, built on a rectified flow transformer architecture. It combines text-to-image generation, single-reference editing, and multi-reference composition in one compact model and achieves end-to-end inference in under one second. The model runs on consumer GPUs (RTX 3090/4070 and above) and is released under the Apache 2.0 license. It is step-distilled to 4 inference steps and uses a Qwen3-based text encoder.
Momentum trend (chart, 04.04. to 03.07.)
Properties
| Property | Value |
| --- | --- |
| API Availability | Yes – official BFL REST API (flux-2-klein-4b); also available via Replicate, OpenRouter, fal.ai, Segmind, NVIDIA Build, among others |
| Benchmark Score (Text-to-Image) | Average CLIP score: 0.335 (benchmarked on H100, 10 categories); BFL's Elo-based evaluation places the model on the Pareto frontier of quality vs. latency and VRAM relative to Qwen and Z-Image models |
| Image Resolution (Max.) | Up to 4 megapixels (e.g., 2048×2048); minimum resolution 64×64; dimensions must be multiples of 16 |
| Fine-Tuning | Only via the base variant (FLUX.2-klein-base-4B): undistilled, intended for LoRA training and fine-tuning; the distilled 4B variant is not designed for fine-tuning |
| Generation Speed | Distilled: ~1.2 s on RTX 5090 (ComfyUI); 0.57 s on H100 at 1024×1024 (4 steps); sub-second on modern hardware according to BFL |
| Price Tier | API: starting at $0.014 per image (1 MP, BFL API); each additional megapixel +$0.001; free locally under Apache 2.0 |
| Memory Footprint (GB) | ~13 GB VRAM (BF16, official); FP8 quantization reduces this to ~6–8 GB; NVFP4 reduces VRAM by up to 55% vs. BF16 |
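
The resolution limits and API pricing listed in the table can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not an official client. It rests on two assumptions that the table does not state: that one megapixel means 1024×1024 pixels (which makes 2048×2048 exactly 4 MP, matching the table's example), and that partial megapixels are billed as whole ones. The function names `validate_size` and `estimate_price_mills` are hypothetical helpers introduced only for illustration.

```python
# Limits taken from the "Image Resolution (Max.)" row.
MIN_SIDE = 64                  # minimum width/height in pixels
SIDE_MULTIPLE = 16             # both sides must be multiples of 16
PIXELS_PER_MP = 1024 * 1024    # ASSUMPTION: 1 MP = 1024x1024 pixels
MAX_PIXELS = 4 * PIXELS_PER_MP # "up to 4 megapixels"

# Prices from the "Price Tier" row, in thousandths of a dollar (mills)
# so that all arithmetic stays in exact integers.
BASE_PRICE_MILLS = 14          # $0.014 for the first megapixel
EXTRA_MP_MILLS = 1             # +$0.001 per additional megapixel


def validate_size(width: int, height: int) -> list[str]:
    """Return a list of human-readable problems; empty means the size is valid."""
    problems = []
    for name, side in (("width", width), ("height", height)):
        if side < MIN_SIDE:
            problems.append(f"{name} {side} is below the {MIN_SIDE} px minimum")
        if side % SIDE_MULTIPLE != 0:
            problems.append(f"{name} {side} is not a multiple of {SIDE_MULTIPLE}")
    if width * height > MAX_PIXELS:
        problems.append(f"{width}x{height} exceeds the 4 MP maximum")
    return problems


def estimate_price_mills(width: int, height: int) -> int:
    """Estimate the API price for one image, in thousandths of a dollar.

    ASSUMPTION: partial megapixels are rounded up to the next whole megapixel.
    """
    problems = validate_size(width, height)
    if problems:
        raise ValueError("; ".join(problems))
    pixels = width * height
    megapixels = -(-pixels // PIXELS_PER_MP)  # integer ceiling division
    return BASE_PRICE_MILLS + EXTRA_MP_MILLS * (megapixels - 1)


if __name__ == "__main__":
    for w, h in [(1024, 1024), (2048, 2048), (1000, 1024)]:
        issues = validate_size(w, h)
        if issues:
            print(f"{w}x{h}: invalid ({'; '.join(issues)})")
        else:
            print(f"{w}x{h}: ${estimate_price_mills(w, h) / 1000:.3f}")
```

Under these assumptions a 1024×1024 image costs the base price and a 2048×2048 image adds three further megapixels on top of it. The actual billing granularity should be confirmed against the provider's pricing page before relying on the estimate.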