Synthszr Charts — die großen AI-Marken im Wettkampf ums Podium
synthszr charts
omlx

oMLX

#10 v Lokální LLM runtime

omlx · od 2026-02-13 · 2× · naposledy 29. 6. 2026

16
Momentum

oMLX is a native macOS inference server for Apple Silicon (M1 or later), built on Apple's MLX framework. Its core feature is a two-tier KV cache (hot tier in RAM, cold tier on SSD in safetensors format) that persists cache blocks across server restarts. The server supports text LLMs, VLMs, OCR models, embeddings, and rerankers, and exposes both an OpenAI-compatible and an Anthropic-compatible REST API. It is managed via a native macOS menu bar app (not Electron) with a supplementary web admin dashboard.

Vývoj momenta
04.04.03.07.

Vlastnosti

API TypeOpenAI-compatible (/v1/chat/completions) + Anthropic-compatible (/v1/messages); FastAPI-based
Inference BackendApple MLX (mlx-lm / mlx-vlm); BatchGenerator for continuous batching; two-tier paged KV cache (RAM + SSD)
Maximum Model Size (GB RAM)Minimum 16 GB RAM; 64 GB+ recommended; tested configurations up to 512 GB (Mac Studio M3 Ultra)
Platforms (OS Support)macOS 15+ (Sequoia) on Apple Silicon (M1/M2/M3/M4) — no Windows, no Linux, no Intel Mac
Price TierFree, open source (Apache License 2.0)
UI TypeNative macOS menu bar app (SwiftUI/PyObjC, no Electron) + web admin dashboard (/admin) for model management, chat, benchmarks, and monitoring

Zdroje (2)

Další produkty v této kategorii: Lokální LLM runtime

Subscribe free. Unsubscribe the second it sucks.

High-signal news across AI, business, UX, and tech. Every morning.