PaddlePaddle/PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

GitHub repository with 80,110 stars and 10,614 forks.

Language: Python

Topics: ocr, chineseocr, pdf2markdown, pp-ocr, pp-structure, document-parsing, document-translation, kie, ai4science, pdf-extractor-rag

24h trend summary

Trending score 2.47, activity score 0.05, stars gained +425, forks gained +38.

Latest metric snapshot

2026-06-05: 80,110 stars and 10,614 forks.

Similar repositories

1. opendatalab/MinerU

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

GitHub repository with 66,498 stars and 5,607 forks.

Trending score: 2.60; stars gained: +331; forks gained: +32.

Language: Python

Topics: ai4science, document-analysis, docx, extract-data, layout-analysis, ocr

2. PaddlePaddle/PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

GitHub repository with 80,110 stars and 10,614 forks.

Trending score: 2.47; stars gained: +425; forks gained: +38.

Language: Python

Topics: ocr, chineseocr, pdf2markdown, pp-ocr, pp-structure, document-parsing

3. thomaswantstobeaskeleton/BallonsTranslator-Pro

Computer-aided manga/comic/manhwa/manhua translation tool based on BallonsTranslator, with expanded OCR, detection, inpainting, workflow, font, and export options.

GitHub repository with 68 stars and 6 forks.

Trending score: 0.44; stars gained: +1; forks gained: +0.

Language: Python

Topics: inpainting, manga, ocr, auto-translation, chinese-translation, comics

4. run-llama/llama-parse-py

Python SDK for OCR and document parsing in the cloud with LlamaParse

GitHub repository with 38 stars and 9 forks.

Trending score: 0.24; stars gained: +0; forks gained: +0.

Language: Python

Topics: agent, agents, document-agent, document-processing, information-extraction, ocr

5. alephpi/Texo

A minimalist SOTA LaTeX OCR model with only 20M parameters, running in browser. Full training pipeline available for self-reproduction.

GitHub repository with 826 stars and 47 forks.

Trending score: 0.09; stars gained: +0; forks gained: +0.

Language: Python

Topics: computer-vision, deep-learning, distillation-model, hydra, latex-ocr, machine-learning

6. neuronaline/flask-ai-agent-studio

A self-hosted Flask AI assistant with RAG, vision, multi-tool execution, and canvas document editing. Full workflow automation in one open-source platform.

GitHub repository with 6 stars and 0 forks.

Trending score: 0.04; stars gained: +0; forks gained: +0.

Language: Python

Topics: ai-agent, ai-assistant, chatbot, chromadb, deepseek, flask

Trending in Python

1. NousResearch/hermes-agent

The agent that grows with you

GitHub repository with 181,471 stars and 31,147 forks.

Trending score: 5.95; stars gained: +1,867; forks gained: +361.

Language: Python

Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

2. chopratejas/headroom

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

GitHub repository with 12,942 stars and 833 forks.

Trending score: 5.69; stars gained: +2,829; forks gained: +175.

Language: Python

Topics: agent, ai, anthropic, claude-code, compression, context-engineering

3. Imbad0202/academic-research-skills

Academic Research Skills for Claude Code: research → write → review → revise → finalize

GitHub repository with 27,389 stars and 2,252 forks.

Trending score: 5.52; stars gained: +1,079; forks gained: +89.

Language: Python

Topics: academic-pipeline, academic-writing, ai-research, claude, claude-code, literature-review

4. anthropics/financial-services

GitHub repository with 30,002 stars and 4,224 forks.

Trending score: 4.88; stars gained: +688; forks gained: +114.

Language: Python

5. virgiliojr94/book-to-skill

Turn any technical book PDF into a Claude Code skill — ready to study, reference, and use while you work.

GitHub repository with 4,221 stars and 528 forks.

Trending score: 4.88; stars gained: +476; forks gained: +68.

Language: Python

6. vinta/awesome-python

An opinionated list of Python frameworks, libraries, tools, and resources

GitHub repository with 301,341 stars and 28,044 forks.

Trending score: 4.60; stars gained: +518; forks gained: +24.

Language: Python

Topics: awesome, python, collections, python-frameworks, python-libraries, python-tools

Trending topic: ocr

1. opendatalab/MinerU

2. PaddlePaddle/PaddleOCR

3. shipfastlabs/parsel

4. run-llama/liteparse

5. marco-beltrame/WinLens

6. itmisx/deepx-code