Synthszr Charts: the big AI brands competing for the podium

Bonsai

#19 in Small & Edge Models

prism-ml · since 2026-03-31 · 2× · last seen 29 June 2026

Momentum

Bonsai-4B is an open-source 1-bit language model by PrismML based on the Qwen3 architecture, designed for deployment on edge devices such as iPhones, Macs, and CUDA GPUs. Weights are stored at 1.125 bits per parameter (1 sign bit plus one FP16 scale per group of 128 weights). The model is distributed in GGUF (Q1_0) and MLX 1-bit formats under the Apache 2.0 license. PrismML released it alongside Bonsai-8B and Bonsai-1.7B on March 31, 2026, when the company emerged from stealth.
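The 1.125 bits-per-parameter figure follows from amortising one FP16 scale over each group of 128 sign bits. The sketch below reproduces that arithmetic and adds a toy quantizer to make the layout concrete. The group size, scale width, and sign-bit count come from the description above; the rule for choosing the scale (mean absolute value) and the bit convention (1 for non-negative) are assumptions made for illustration and are not taken from the Q1_0 or MLX format definitions.

```python
from typing import List, Tuple

GROUP_SIZE = 128  # weights sharing one scale (from the description above)
SCALE_BITS = 16   # one FP16 scale per group
SIGN_BITS = 1     # one sign bit per weight


def bits_per_weight(group_size: int = GROUP_SIZE,
                    scale_bits: int = SCALE_BITS,
                    sign_bits: int = SIGN_BITS) -> float:
    """Storage cost per weight: sign bits plus the amortised share of the scale."""
    return sign_bits + scale_bits / group_size


def quantize_group(weights: List[float]) -> Tuple[float, List[int]]:
    """Toy 1-bit quantizer (assumption): scale = mean |w|, bit = 1 if w >= 0.

    The scale is kept as a Python float here; a real format would round it to FP16.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    bits = [1 if w >= 0 else 0 for w in weights]
    return scale, bits


def dequantize_group(scale: float, bits: List[int]) -> List[float]:
    """Reconstruct each weight as +scale or -scale from its sign bit."""
    return [scale if b else -scale for b in bits]


if __name__ == "__main__":
    print(bits_per_weight())  # 1.125
    scale, bits = quantize_group([0.5, -1.5, 1.0, -1.0])
    print(scale, bits, dequantize_group(scale, bits))
```

Every reconstructed weight in a group has the same magnitude, which is why the per-group scale is the only non-sign information that needs to be stored.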

Momentum over time

(Chart: momentum from 4 April to 3 July)

Properties

Throughput (Tokens/Second)	Approx. 23 tokens/s on an M1 MacBook Air (independent test). For reference, Bonsai-8B is reported at 44 tokens/s on an iPhone 17 Pro Max and 80–100+ tokens/s on an RTX 5060 Ti
Context Window	32,768 tokens (32K)
Model Size (Parameters)	4 billion parameters (architecture: Qwen3); GGUF file: 572 MB (Q1_0, 1.125 bpw)
Offline Capability	Fully offline-capable; runs locally on iPhone/iPad (via MLX Swift), Apple Silicon Macs (MLX), and CUDA and Metal GPUs (llama.cpp fork). No cloud access required.
Price Tier	Free / Open Source (Apache 2.0): commercial use, modification, and redistribution permitted, subject to the license's notice and attribution conditions
Memory Footprint (GB)	Approx. 0.5 GB (GGUF Q1_0 on disk: 0.57 GB incl. tokenizer and metadata; the packed weights alone are slightly smaller). For comparison, the unpacked FP16 variant requires 8.1 GB of VRAM
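The size and memory figures in the table can be sanity-checked from the parameter count and the bits per weight. This is a rough estimate under stated assumptions: the parameter count is the nominal 4 billion from the table, megabytes are decimal (10^6 bytes), and every tensor is assumed to be stored at the same precision, which may not hold for the real files. The helper name `packed_size_mb` is hypothetical.

```python
def packed_size_mb(n_params: float, bits_per_weight: float) -> float:
    """Estimated weight storage in decimal megabytes (1 MB = 10**6 bytes)."""
    return n_params * bits_per_weight / 8 / 1e6


if __name__ == "__main__":
    nominal_params = 4e9  # "4 billion parameters" from the table (nominal, assumption)
    print(packed_size_mb(nominal_params, 1.125))  # 562.5   (1-bit packed weights)
    print(packed_size_mb(nominal_params, 16))     # 8000.0  (FP16 weights)
```

Both estimates land slightly below the listed 572 MB and 8.1 GB. That gap is plausible if the true parameter count is a little above the nominal 4 billion and the file also carries tokenizer data and metadata, though the table does not break this down.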
