Synthszr Charts: the major AI brands competing for the podium

DeepSeek V3

#3 in Reasoning models

deepseek · v3 · since 2024-12-26 · 42× · most recently 2026-07-01

Momentum

DeepSeek V3 is an open-source language model by DeepSeek, released on December 26, 2024. It is based on a Mixture-of-Experts (MoE) architecture with 671 billion total parameters, activating only 37 billion per token. The model was pre-trained on 14.8 trillion tokens and employs Multi-head Latent Attention (MLA) and FP8 training. It achieves benchmark performance comparable to leading proprietary models, particularly in mathematics, coding, and multilingual tasks.
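The sparsity implied by those parameter counts can be checked with a little arithmetic. The following is a minimal sketch that uses only the two figures quoted above; the function and constant names are illustrative and not part of any DeepSeek tooling.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters used per token in a sparse MoE model."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


TOTAL_PARAMS = 671e9   # total parameters, as quoted above
ACTIVE_PARAMS = 37e9   # parameters activated per token, as quoted above

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"{fraction:.1%} of parameters are active per token")  # prints "5.5% ..."
```

Roughly one parameter in eighteen participates in any single token's forward pass, which is why per-token compute is far lower than the total size suggests.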

Momentum trend

[Chart: x-axis from 04.04. to 03.07.]

Properties

Key Benchmark (%)	MMLU: 88.5% | MATH-500: 90.2% | GPQA: 59.1% | Codeforces Percentile: 51.6% | SWE-Bench Verified: 42.0%
Context Window (Tokens)	128,000 tokens
License	MIT License (code repository); DeepSeek Model License for model weights – commercial use allowed
Multimodality	None; the model is text-only. DeepSeek announced multimodal support as a future feature. DeepSeek's multimodal models are released separately as the Janus series.
Platform	DeepSeek API (platform.deepseek.com, OpenAI-compatible endpoint); self-hosting with weights from HuggingFace using SGLang, vLLM, TensorRT-LLM, or LMDeploy, including on AMD GPUs and Huawei Ascend NPUs
Price	Free (open weights, self-hosting); API access via platform.deepseek.com paid per token
Price per 1M Tokens	$0.27 / 1M input tokens (cache miss), $0.07 / 1M input tokens (cache hit), $1.10 / 1M output tokens (original launch pricing)
Release Date	December 26, 2024
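
To make the per-token pricing concrete, here is a minimal cost-estimation sketch. It assumes the launch prices listed in the table above; the `Pricing` class, the `estimate_cost` function, and the `cache_hit_ratio` parameter are hypothetical names introduced for illustration, not part of the DeepSeek API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1M tokens, taken from the launch pricing listed above."""
    input_cache_miss: float = 0.27
    input_cache_hit: float = 0.07
    output: float = 1.10


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    cache_hit_ratio: float = 0.0,
    pricing: Pricing = Pricing(),
) -> float:
    """Estimate the API cost in USD for a given token volume.

    cache_hit_ratio is the assumed share of input tokens served from cache.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (
        hit_tokens * pricing.input_cache_hit
        + miss_tokens * pricing.input_cache_miss
        + output_tokens * pricing.output
    )
    return total / 1_000_000


# Example: 2M input tokens (half served from cache) and 500k output tokens.
cost = estimate_cost(2_000_000, 500_000, cache_hit_ratio=0.5)
print(f"${cost:.2f}")  # prints "$0.89"
```

Because output tokens cost about four times as much as uncached input tokens at these rates, long generations tend to dominate the bill, while a high cache-hit ratio mainly helps prompt-heavy workloads.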
