Synthszr Charts — die großen AI-Marken im Wettkampf ums Podium
synthszr charts
omlx

oMLX

#10 in Lokale LLM-Runtimes

omlx · siet 2026-02-13 · 2× · tolest 29. Juni 2026

16
Momentum

oMLX is a native macOS inference server for Apple Silicon (M1 or later), built on Apple's MLX framework. Its core feature is a two-tier KV cache (hot tier in RAM, cold tier on SSD in safetensors format) that persists cache blocks across server restarts. The server supports text LLMs, VLMs, OCR models, embeddings, and rerankers, and exposes both an OpenAI-compatible and an Anthropic-compatible REST API. It is managed via a native macOS menu bar app (not Electron) with a supplementary web admin dashboard.

Momentum-Verloop
04.04.03.07.

Features

API TypeOpenAI-compatible (/v1/chat/completions) + Anthropic-compatible (/v1/messages); FastAPI-based
Inference BackendApple MLX (mlx-lm / mlx-vlm); BatchGenerator for continuous batching; two-tier paged KV cache (RAM + SSD)
Maximum Model Size (GB RAM)Minimum 16 GB RAM; 64 GB+ recommended; tested configurations up to 512 GB (Mac Studio M3 Ultra)
Platforms (OS Support)macOS 15+ (Sequoia) on Apple Silicon (M1/M2/M3/M4) — no Windows, no Linux, no Intel Mac
Price TierFree, open source (Apache License 2.0)
UI TypeNative macOS menu bar app (SwiftUI/PyObjC, no Electron) + web admin dashboard (/admin) for model management, chat, benchmarks, and monitoring

Belege (2)

Mehr Produkten in disse Kategorie: Lokale LLM-Runtimes

Subscribe free. Unsubscribe the second it sucks.

High-signal news across AI, business, UX, and tech. Every morning.