
Synthszr Charts: the big AI brands competing for the podium

DeepSeek-OCR

#22

deepseek · since 2025-10-20 · 2× · most recently June 30, 2026

Momentum

DeepSeek-OCR is an open-source vision-language model from DeepSeek AI, released on October 20, 2025. It uses a technique called "Contexts Optical Compression" (COC): instead of converting a document page into a long sequence of text tokens, it compresses the page into a small number of vision tokens. The architecture pairs the DeepEncoder (~380M parameters, combining SAM-Base and CLIP-Large with a 16× convolutional compressor) with the DeepSeek-3B-MoE decoder (3B total parameters, ~570M active per token). Served with vLLM on a single NVIDIA A100-40G, it reaches approximately 2,500 tokens/second and more than 200,000 pages/day; the model weights (~6.7 GB in BF16) are freely available under the MIT license.
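
The figures in this description can be cross-checked with some simple arithmetic. The sketch below is only a back-of-the-envelope check, not a benchmark: it assumes BF16 stores 2 bytes per parameter, that the weight file holds little beyond the parameters themselves, and that the tokens/second and pages/day figures describe the same workload. The function names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted in the description.
# Inputs are the approximate numbers from the text; helper names are hypothetical.

BYTES_PER_BF16_PARAM = 2          # BF16 uses 16 bits per parameter
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400

decoder_params = 3.0e9            # DeepSeek-3B-MoE, total parameters
decoder_active_params = 570e6     # parameters active per token
encoder_params = 380e6            # DeepEncoder


def weight_size_gb(total_params: float) -> float:
    """Approximate BF16 weight size in decimal gigabytes."""
    return total_params * BYTES_PER_BF16_PARAM / 1e9


def implied_tokens_per_page(tokens_per_second: float, pages_per_day: float) -> float:
    """Tokens per page implied by a throughput and a daily page count."""
    return tokens_per_second * SECONDS_PER_DAY / pages_per_day


total_params = decoder_params + encoder_params
active_fraction = decoder_active_params / decoder_params

print(f"{weight_size_gb(total_params):.2f} GB")   # 6.76 GB
print(implied_tokens_per_page(2_500, 200_000))    # 1080.0
print(f"{active_fraction:.0%}")                   # 19%
```

Under these assumptions the parameter counts land close to the quoted ~6.7 GB, and the two throughput figures together imply roughly a thousand tokens per page.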

Momentum history


Features

Latency (ms)	100–400 ms per page on an A100 GPU (~100 ms for simple documents, ~400 ms for complex documents with tables or charts)
Model Size (Parameter Count)	~3B decoder parameters (DeepSeek-3B-MoE: 3B total, ~570M active per token) plus ~380M in the DeepEncoder; weight file ~6.7 GB in BF16
Price Tier	Open source / free: MIT-licensed weights, free self-hosting with no API fees. Third-party API (e.g., DeepInfra): $0.03/M input tokens and $0.10/M output tokens.
Language Support (Count)	100+ languages (trained on over 30M PDF pages in 100+ languages, incl. Latin, CJK, Cyrillic, and scientific scripts)
Processing Speed (x Realtime)	~2,500 tokens/second for PDF processing on an NVIDIA A100-40G (via vLLM); equivalent to >200,000 pages/day on an A100
Word Error Rate (%)	~3% at under 10× compression (96%+ OCR decoding accuracy at 9–10× compression on the Fox benchmark; ~97% precision at <10× compression per the arXiv paper and DigitalOcean docs); accuracy drops to ~60% at 20× compression
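
The table entries can be related to each other with a few lines of arithmetic. The following is a minimal sketch under stated assumptions: pages are processed strictly one after another on a single GPU, the third-party prices are the per-million-token rates listed above, and the compression ratio is taken as text tokens divided by vision tokens. The helper names and the example token counts are hypothetical.

```python
# Rough arithmetic on the feature table; helper names and sample inputs are hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_day(latency_ms: float) -> float:
    """Pages per day for one GPU handling pages sequentially at a fixed latency."""
    return SECONDS_PER_DAY * 1000 / latency_ms


def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 0.03,
                 output_price_per_m: float = 0.10) -> float:
    """Cost at the listed third-party rates (USD per million tokens)."""
    return (input_tokens / 1e6) * input_price_per_m \
        + (output_tokens / 1e6) * output_price_per_m


def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Text tokens on a page divided by the vision tokens that encode it."""
    return text_tokens / vision_tokens


print(pages_per_day(100))   # 864000.0 (simple documents)
print(pages_per_day(400))   # 216000.0 (complex documents)

# Hypothetical workload: one million input and one million output tokens.
print(round(api_cost_usd(1_000_000, 1_000_000), 2))

# Hypothetical page: 1,000 text tokens encoded as 100 vision tokens.
print(compression_ratio(1000, 100))   # 10.0
```

At the 400 ms end of the latency range, sequential processing gives about 216,000 pages/day, which is in line with the >200,000 pages/day figure in the table; batching under vLLM could change this picture.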
