Based on DeepSeek OCR 3B Model

DeepSeek OCR

AI-Powered Text Extraction | 97% Accuracy | Multi-language Markdown Output

Built on the DeepSeek vision language model, with ultra-low token consumption. Free, open source, and self-hostable.

97% accuracy
100 tokens/page
200,000+ pages/day

OCR Model Comparison

Compare DeepSeek-OCR with other mainstream OCR solutions on key metrics: accuracy, efficiency, and deployment characteristics.

| Model/Tool | Parameter Scale | Compression Support | Accuracy | Advantages | Disadvantages |
|---|---|---|---|---|---|
| 🚀 DeepSeek-OCR (Recommended) | 3B | Yes | 97% | Efficient, multi-language Markdown output | Non-deterministic, hardware dependent |
| 📊 GOT-OCR 2.0 | ~7B | No | 98% (no compression) | High fidelity | High token consumption (60x) |
| 📄 MinerU 2.0 | ~10B | No | 95% | Powerful PDF processing | Slow speed (6000+ tokens/page) |
| ⚡ PaddleOCR | Lightweight | No | 90% | Easy deployment | Weak structured output |
| 💬 ChatGPT (GPT-4o) | Closed source | No | ~85% (OCR-limited) | Easy to use | Short context, rejects long documents |
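
To make the token figures in the comparison concrete, here is a minimal sketch that estimates the token cost of one document under two of the per-page figures quoted on this page. The function name `document_tokens` and the 500-page document are hypothetical, chosen only for illustration, and the MinerU figure is listed as "6000+", so 6000 is treated as a lower bound.

```python
# Tokens-per-page figures as quoted on this page.
# MinerU is listed as "6000+", so 6000 is a lower bound, not an exact value.
TOKENS_PER_PAGE = {
    "DeepSeek-OCR": 100,
    "MinerU 2.0": 6000,
}


def document_tokens(tool: str, pages: int) -> int:
    """Estimated tokens consumed for a document of `pages` pages."""
    return TOKENS_PER_PAGE[tool] * pages


pages = 500  # hypothetical document length
deepseek = document_tokens("DeepSeek-OCR", pages)
mineru = document_tokens("MinerU 2.0", pages)

print(deepseek)            # 50000
print(mineru)              # 3000000
print(mineru // deepseek)  # 60
```

This is simple arithmetic on the quoted numbers, not a measurement; actual consumption depends on the resolution mode and the document.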

Frequently Asked Questions

Everything you need to know about DeepSeek OCR

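How does DeepSeek OCR compare with Tesseract and PaddleOCR?

DeepSeek OCR uses a vision language model (VLM) for context-aware OCR, whereas Tesseract and PaddleOCR are traditional pattern-matching engines. Key differences: accuracy of 97% vs. 85%, and token efficiency of 100 tokens/page vs. higher processing overhead.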

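What is the difference between the resolution modes?

The resolution modes trade token consumption against accuracy: Tiny (64 tokens) for simple documents; Small (100 tokens), the recommended default; Base (256 tokens) for complex layouts; Large (400 tokens) for high-resolution input; Gundam for academic papers.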
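
As a rough illustration of that trade-off, the sketch below picks the largest mode that fits a per-page token budget. The helper `pick_mode` and the lookup table `MODE_TOKENS` are hypothetical names, not part of any DeepSeek OCR API, and Gundam is left out because no token count is given for it here.

```python
from typing import Optional

# Per-page token budgets as listed in the FAQ answer above.
# Gundam is omitted because no token count is stated for it.
MODE_TOKENS = {"Tiny": 64, "Small": 100, "Base": 256, "Large": 400}


def pick_mode(max_tokens_per_page: int) -> Optional[str]:
    """Return the highest-budget mode that fits, or None if none fits."""
    fitting = [
        (tokens, mode)
        for mode, tokens in MODE_TOKENS.items()
        if tokens <= max_tokens_per_page
    ]
    return max(fitting)[1] if fitting else None


print(pick_mode(300))  # Base
print(pick_mode(100))  # Small
print(pick_mode(50))   # None
```

Choosing by budget alone is a simplification; the descriptions above suggest the document type (simple, complex layout, academic paper) should also inform the choice.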

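Is DeepSeek OCR really free and open source?

Yes, 100% open source! The 3B-parameter model is available on GitHub and Hugging Face under a permissive license. You can self-host it, modify the model, and use it commercially with no license fees.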

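What are the hardware requirements for self-hosting?

Minimum: 8GB VRAM (RTX 3070) for basic inference. Recommended: 16GB+ VRAM (RTX 4090, A100-40G) for production. Enterprise: multi-GPU setups to process 200,000+ pages/day.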
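
For capacity planning, the following sketch maps a GPU's VRAM to the tiers above and converts a daily page target into the sustained rate it implies. The function names are hypothetical, and the thresholds are just the 8GB and 16GB figures quoted in the answer, not measured limits.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def hardware_tier(vram_gb: float) -> str:
    """Classify a GPU by the VRAM thresholds quoted above (8GB / 16GB)."""
    if vram_gb >= 16:
        return "recommended"
    if vram_gb >= 8:
        return "minimum"
    return "below minimum"


def required_pages_per_second(pages_per_day: int) -> float:
    """Sustained throughput needed to reach a daily page target."""
    return pages_per_day / SECONDS_PER_DAY


print(hardware_tier(8))    # minimum
print(hardware_tier(24))   # recommended
print(round(required_pages_per_second(200_000), 2))  # 2.31
```

The 200,000 pages/day figure therefore corresponds to a little over two pages per second sustained around the clock, which is why it is listed under multi-GPU configurations.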