Synthszr Charts — die großen AI-Marken im Wettkampf ums Podium

ollama

#1 in Lokale LLM-Runtimes

ollama · seit 2023-07-08 · 101× · zuletzt 02. Juli 2026

100

Momentum

Ollama ist eine quelloffene, lokal betriebene Laufzeitumgebung für Large Language Models (LLMs), die unter der MIT-Lizenz kostenfrei verfügbar ist. Die Software wurde 2023 von Michael Chiang und Jeffrey Morgan gegründet und ermöglicht es, Modelle wie Llama, Mistral, Qwen, Gemma oder DeepSeek mit wenigen Befehlen auf eigener Hardware zu starten. Als Inferenz-Backend kommt primär llama.cpp zum Einsatz; ab Version 0.19 (März 2026) wird auf Apple Silicon zusätzlich Apples MLX-Framework unterstützt. Ollama stellt eine OpenAI-kompatible REST-API bereit und unterstützt macOS, Linux und Windows ohne Cloud-Abhängigkeit.

Momentum-Verlauf

04.04.03.07.

Features

Deployment (Self-host/Cloud)	Primär Self-host (lokal, Docker, eigener Server). Optional: Ollama Cloud (GA seit Sept. 2025) – gehosteter Inferenzdienst auf NVIDIA-Datacenter-Hardware, OpenAI-kompatibler Endpunkt, kein Daten-Logging.
Durchsatz/Latenz	Min.-Config (CPU, 8 GB RAM): 3–8 t/s @ 7B. 16 GB VRAM (RTX/M-Series): 30–60 t/s @ 7B–14B. Apple M4 / RTX 4090: ~40 t/s @ 7B Q4. Cloud: niedriges TTFT + hoher Durchsatz, kein SLA.
Lizenz	MIT License (Core/CLI, github.com/ollama/ollama). GUI-App (ab 2025) separat ohne veröffentlichte Lizenz.
Plattform	macOS (≥14 Sonoma), Windows 10/11 (amd64, arm64), Linux (glibc 2.31+, amd64/arm64), Docker. GPU: NVIDIA CUDA (Compute 5.0+), AMD ROCm (Linux), Apple Metal / MLX (Apple Silicon).
Preis	Lokal: kostenlos & unbegrenzt. Cloud: Free ($0), Pro ($20/Monat oder $200/Jahr), Max ($100/Monat). Abrechnung nach GPU-Zeit, kein Token-Cap.
Protokoll-Kompatibilität	Eigene REST-API (Port 11434, NDJSON-Streaming). OpenAI Chat Completions API-kompatibel. Anthropic Messages API-kompatibel. Python- und JavaScript/TypeScript-Bibliotheken. Structured Outputs (JSON Schema).
Release-Datum	Juli 2023 (erste öffentliche GitHub-Veröffentlichung); aktuell v0.30.10 (Juni 2026)
Unterstützte Modelle/Provider	Llama, Gemma 4, Qwen, DeepSeek, Mistral, Phi, gpt-oss, Kimi, GLM, MiniMax, LLaVA u.v.m. – vollständige Bibliothek unter ollama.com/library. Chat, Code, Vision, Embeddings, Reasoning.

ollama

Features

Belege (60)

Weitere Produkte in dieser Kategorie: Lokale LLM-Runtimes

Subscribe free. Unsubscribe the second it sucks.