Language

Synthszr Charts — die großen AI-Marken im Wettkampf ums Podium

ollama

ollama · seit 2023-07-08 · 101× · zuletzt 02. Juli 2026

100

Momentum

Ollama is an open-source, locally operated runtime environment for large language models (LLMs), available free of charge under the MIT license. Founded in 2023 by Michael Chiang and Jeffrey Morgan, it allows users to run models such as Llama, Mistral, Qwen, Gemma, or DeepSeek on their own hardware with just a few commands. The primary inference backend is llama.cpp; from version 0.19 (March 2026) onwards, Apple's MLX framework is additionally supported on Apple Silicon. Ollama provides an OpenAI-compatible REST API and supports macOS, Linux, and Windows without any cloud dependency.

Historique du momentum

04.04.03.07.

Fonctionnalités

Deployment (Self-Hosted/Cloud)	Primarily self-hosted (local, Docker, own server). Optional: Ollama Cloud (GA since Sept. 2025) – hosted inference service on NVIDIA data center hardware, OpenAI-compatible endpoint, no data logging.
Throughput/Latency	Min. config (CPU, 8 GB RAM): 3–8 t/s @ 7B. 16 GB VRAM (RTX/M-Series): 30–60 t/s @ 7B–14B. Apple M4 / RTX 4090: ~40 t/s @ 7B Q4. Cloud: low TTFT + high throughput, no SLA.
License	MIT License (core/CLI, github.com/ollama/ollama). GUI app (from 2025) separate, no published license.
Platform	macOS (≥14 Sonoma), Windows 10/11 (amd64, arm64), Linux (glibc 2.31+, amd64/arm64), Docker. GPU: NVIDIA CUDA (Compute 5.0+), AMD ROCm (Linux), Apple Metal / MLX (Apple Silicon).
Price	Local: free & unlimited. Cloud: Free ($0), Pro ($20/month or $200/year), Max ($100/month). Billed by GPU time, no token cap.
Protocol Compatibility	Own REST API (port 11434, NDJSON streaming). OpenAI Chat Completions API-compatible. Anthropic Messages API-compatible. Python and JavaScript/TypeScript libraries. Structured outputs (JSON schema).
Release Date	July 2023 (first public GitHub release); currently v0.30.10 (June 2026)
Supported Models/Providers	Llama, Gemma 4, Qwen, DeepSeek, Mistral, Phi, gpt-oss, Kimi, GLM, MiniMax, LLaVA, and many more – full library at ollama.com/library. Chat, code, vision, embeddings, reasoning.

ollama

Fonctionnalités

Preuves (60)

Subscribe free. Unsubscribe the second it sucks.