

Groq
#1 in AI Inference Hardware · groq · since 2024-02-19 (GroqCloud Developer Platform soft launch) · 15× · most recently July 2, 2026
Groq is a US-based company that offers the LPU (Language Processing Unit), a chip purpose-built for AI inference, along with its GroqCloud platform. The LPU combines large on-chip SRAM in place of external memory, deterministic, statically scheduled execution, and a custom compiler stack to achieve low latency and high throughput for language model inference. The hardware is available through GroqCloud (a pay-per-token API) and as GroqRack clusters for on-premises deployment. In December 2025, Groq also announced a multibillion-dollar, non-exclusive agreement licensing its LPU technology to Nvidia.
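Because the LPU relies on on-chip SRAM rather than external memory, per-chip capacity bounds how much of a model a single chip can hold. The rough sketch below estimates a lower bound on chip count from the 230 MB per-chip figure in the feature table. It assumes that all weights must reside in SRAM, that megabytes are decimal, and that activations, KV cache, and any replication are ignored; the function name and the precision choices are illustrative, not Groq's sizing method.

```python
import math

# 230 MB SRAM per chip (current generation), assuming decimal megabytes.
SRAM_PER_CHIP_BYTES = 230 * 10**6


def min_chips_for_weights(num_params: int, bytes_per_param: float) -> int:
    """Lower bound on chips needed to hold the weights alone in SRAM.

    Hypothetical helper: ignores activations, KV cache, and any
    duplication of weights across chips.
    """
    weight_bytes = num_params * bytes_per_param
    return math.ceil(weight_bytes / SRAM_PER_CHIP_BYTES)


# A 70B-parameter model at an assumed 8-bit and 16-bit weight precision.
print(min_chips_for_weights(70_000_000_000, 1))  # 305
print(min_chips_for_weights(70_000_000_000, 2))  # 609
```

Even under these optimistic assumptions, a 70B-parameter model spans hundreds of chips, which is consistent with the hardware being offered as multi-chip GroqRack clusters rather than as single accelerators.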
Features
| Feature | Details |
| --- | --- |
| Manufacturing Process (nm) | Current generation: GlobalFoundries 14 nm; next generation: Samsung SF4X 4 nm process |
| License | Proprietary hardware/cloud services (Groq Services Agreement); hosted models are mostly open-source (e.g., Llama) with their own licenses; Dec 2025 non-exclusive technology license to Nvidia |
| Platform | GroqCloud (on-demand public cloud, private/co-cloud) and GroqRack compute clusters for on-prem deployment |
| Price | API from $0.05/1M input tokens (Llama 3.1 8B) up to $0.59/1M input tokens (Llama 3.3 70B); output up to $0.79/1M tokens; batch API 50% cheaper |
| Compute Performance (FLOPS/TOPS) | 1st generation (TSP, 14 nm): >1 TeraOp/s per mm² of silicon at 900 MHz clock speed |
| Release Date | GroqCloud Developer Platform soft launch: February 19, 2024 |
| Memory | Up to 230 MB SRAM per chip (current generation); new generation (Groq 3 LPU) 500 MB SRAM with 150 TB/s bandwidth |
| Availability | Publicly available via GroqCloud API (Free, Developer, and Enterprise tiers); GroqRack for enterprise customers upon request |
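The per-token prices in the table translate into a request cost as in the following minimal sketch. It uses the listed $0.59/1M input price for Llama 3.3 70B and the 50% batch discount; pairing the $0.79/1M output price with that same model is an assumption, and `TokenPrice` and `estimate_cost_usd` are hypothetical names, not part of any Groq SDK.

```python
from dataclasses import dataclass

TOKENS_PER_PRICE_UNIT = 1_000_000  # prices are quoted per 1M tokens
BATCH_DISCOUNT = 0.50              # "batch API 50% cheaper"


@dataclass(frozen=True)
class TokenPrice:
    input_usd_per_million: float
    output_usd_per_million: float


# Input price is from the table; the output price pairing is an assumption.
LLAMA_3_3_70B = TokenPrice(input_usd_per_million=0.59,
                           output_usd_per_million=0.79)


def estimate_cost_usd(price: TokenPrice, input_tokens: int,
                      output_tokens: int, batch: bool = False) -> float:
    """Estimate the cost of a workload from its token counts."""
    cost = (input_tokens * price.input_usd_per_million
            + output_tokens * price.output_usd_per_million) / TOKENS_PER_PRICE_UNIT
    if batch:
        cost *= 1.0 - BATCH_DISCOUNT
    return cost


# 1M input tokens plus 1M output tokens, on-demand and via the batch API.
print(f"{estimate_cost_usd(LLAMA_3_3_70B, 1_000_000, 1_000_000):.2f}")
print(f"{estimate_cost_usd(LLAMA_3_3_70B, 1_000_000, 1_000_000, batch=True):.2f}")
```

Under these assumptions the on-demand workload comes to about $1.38 and the batch variant to about $0.69.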