Synthszr Charts: the major AI brands competing for the podium

vLLM

#1 in LLM inference and serving

vllm · since June 2023 (first official release) · 40× · most recently 30 June 2026

Momentum: 100

vLLM is an open-source inference and serving engine for Large Language Models (LLMs), originally developed at UC Berkeley's Sky Computing Lab and maintained as a community project since 2023. Its core architecture is based on PagedAttention (virtual memory management of the KV cache) and continuous batching, delivering significantly higher throughput than naive serving approaches. vLLM supports 200+ model architectures from Hugging Face and runs on a broad range of hardware accelerators. The project is free to use (Apache 2.0) and is backed by an ecosystem of over 2,000 contributors and sponsors including NVIDIA, AMD, Google, AWS, and Intel.
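The paging idea behind PagedAttention can be illustrated with a toy allocator. This is a minimal sketch, not vLLM's actual implementation: the class `PagedKVCache`, its methods, and the block size of 4 tokens are all hypothetical names and values chosen for illustration. It only shows the bookkeeping: each sequence gets a block table mapping its logical token positions to fixed-size physical blocks that are allocated on demand and returned to a shared pool when the sequence finishes.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # tokens per KV block (hypothetical value, kept small for illustration)


@dataclass
class PagedKVCache:
    """Toy block-table bookkeeping; stores no tensors, only block ids."""
    num_blocks: int
    free_blocks: list = field(default_factory=list)
    block_tables: dict = field(default_factory=dict)  # seq_id -> physical block ids
    seq_lens: dict = field(default_factory=dict)      # seq_id -> tokens stored

    def __post_init__(self):
        # All physical blocks start out in the shared free pool.
        self.free_blocks = list(range(self.num_blocks))

    def append_token(self, seq_id):
        """Reserve a slot for one more token; return (physical_block, offset)."""
        n = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if n % BLOCK_SIZE == 0:
            # The sequence has no block yet, or its last block is full.
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1
        return table[-1], n % BLOCK_SIZE

    def free(self, seq_id):
        """Return all blocks of a finished sequence to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


cache = PagedKVCache(num_blocks=3)
for _ in range(5):
    cache.append_token("a")   # 5 tokens need 2 blocks of 4 slots each
cache.append_token("b")       # a second sequence takes the last free block
cache.free("a")               # finishing "a" makes its 2 blocks reusable
```

Because blocks are claimed only when a sequence actually grows into them, memory is not reserved up front for a maximum length, which is what lets a continuous-batching scheduler admit new requests as soon as blocks are freed.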

Momentum trend

[Chart: momentum over time, axis from 04.04. to 03.07.]

Properties

License	Apache License 2.0
Price	Free / Open Source (no license fees; donations via GitHub & OpenCollective)
Release Date	June 2023 (first official release); currently v0.24.0 on PyPI (as of July 2026)
