opendataloader-project/opendataloader-pdf

PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

GitHub repository with 23,663 stars and 2,205 forks.

Language: Java

Topics: json, markdown, pdf, ai, document-parsing, html, pdf-converter, tables, pdf-parser, rag

24h trend summary

Trending score 2.84, activity score 1.17, stars gained +700, forks gained +69.

Latest metric snapshot

2026-06-04: 23,663 stars and 2,205 forks.

Similar repositories

  1. opendataloader-project/opendataloader-pdf

    PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

    GitHub repository with 23,663 stars and 2,205 forks.

    Trending score: 2.84; stars gained: +700; forks gained: +69.

    Language: Java

    Topics: json, markdown, pdf, ai, document-parsing, html

  2. apple/pkl

    A configuration as code language with rich validation and tooling.

    GitHub repository with 11,387 stars and 380 forks.

    Trending score: 0.69; stars gained: +4; forks gained: +0.

    Language: Java

    Topics: config, configuration, data, functional, java, json

  3. stleary/JSON-java

    A reference implementation of a JSON package in Java.

    GitHub repository with 4,714 stars and 2,593 forks.

    Trending score: 0.28; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: hacktoberfest, java, json, public-domain, hackoberfest2023

  4. sirixdb/sirix

    SirixDB is an embeddable, bitemporal, append-only database system and event store, storing immutable lightweight snapshots. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach.

    GitHub repository with 1,186 stars and 247 forks.

    Trending score: 0.23; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: xquery, java, temporal-data, storage, snapshot, comparison

  5. c-eg/themoviedbapi

    A Java wrapper around the JSON API provided by TheMovieDB.org

    GitHub repository with 301 stars and 95 forks.

    Trending score: 0.22; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: java, json, themoviedb, themoviedbapi, java-wrapper, wrapper-library

  6. EwyBoy/SeedDrop

    Minecraft mod that lets you configure what drops from breaking grass with JSON or Commands.

    GitHub repository with 5 stars and 2 forks.

    Trending score: 0.08; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: minecraft, seed, drop, forge, mod, modded

Trending in Java

  1. Lucas0623z/NoteLite

    GitHub repository with 720 stars and 103 forks.

    Trending score: 2.91; stars gained: +47; forks gained: +5.

    Language: Java

  2. opendataloader-project/opendataloader-pdf

    PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

    GitHub repository with 23,663 stars and 2,205 forks.

    Trending score: 2.84; stars gained: +700; forks gained: +69.

    Language: Java

    Topics: json, markdown, pdf, ai, document-parsing, html

  3. fish2018/webhtv

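    WebHomeTV is a fork built on FongMi that adds WebHome custom home pages, an App Native SDK, cloud-drive link detection, and a Nostr-recommended home page. The project's core goal is to turn a CSP site's home page into a genuinely programmable web application: developers can customize the home page with HTML/CSS/JavaScript, then use the native capabilities exposed by the app to handle search, playback, cross-origin requests, resource proxying, recently watched history, cloud-drive detection, and state synchronization.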

    GitHub repository with 350 stars and 105 forks.

    Trending score: 2.73; stars gained: +30; forks gained: +5.

    Language: Java

  4. juanjuandog/FinSight-AI

    AI equity research agent with resilient workflows, Redis Lua single-flight, pgvector RAG, versioned reports, evidence tracing, and RAG evaluation.

    GitHub repository with 978 stars and 57 forks.

    Trending score: 2.63; stars gained: +20; forks gained: +5.

    Language: Java

    Topics: ai-agent, financial-research, llm-evaluation, pgvector, postgresql, rabbitmq

  5. apache/kafka

    Apache Kafka - A distributed event streaming platform

    GitHub repository with 32,709 stars and 15,248 forks.

    Trending score: 2.24; stars gained: +6; forks gained: +1.

    Language: Java

    Topics: java, kafka, scala, streaming

  6. apache/doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

    GitHub repository with 15,431 stars and 3,811 forks.

    Trending score: 2.23; stars gained: +5; forks gained: +0.

    Language: Java

    Topics: olap, database, hudi, iceberg, real-time, sql

Trending topic: json

  1. opendataloader-project/opendataloader-pdf

    PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

    GitHub repository with 23,663 stars and 2,205 forks.

    Trending score: 2.84; stars gained: +700; forks gained: +69.

    Language: Java

    Topics: json, markdown, pdf, ai, document-parsing, html

  2. redis/redis

    For developers, who are building real-time data-driven applications, Redis is the preferred, fastest, and most feature-rich cache, data structure server, and document and vector query engine.

    GitHub repository with 74,686 stars and 24,652 forks.

    Trending score: 1.32; stars gained: +23; forks gained: +9.

    Language: C

    Topics: cache, caching, database, distributed-systems, in-memory, in-memory-database

  3. biomejs/biome

    A toolchain for web projects, aimed to provide functionalities to maintain them. Biome offers formatter and linter, usable via CLI and LSP.

    GitHub repository with 24,826 stars and 1,014 forks.

    Trending score: 1.10; stars gained: +13; forks gained: +2.

    Language: Rust

    Topics: css, formatter, javascript, jsx, linter, static-code-analysis

  4. stephenberry/glaze

    Extremely fast, in memory, serialization, reflection, and RPC library for C++. JSON, BEVE, BSON, CBOR, CSV, JSONB, MessagePack, TOML, YAML, EETF

    GitHub repository with 2,829 stars and 247 forks.

    Trending score: 1.04; stars gained: +11; forks gained: +1.

    Language: C++

    Topics: api, beve, binary, bson, cbor, cplusplus

  5. MariaDB/server

    MariaDB server is a community developed fork of MySQL server. Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry.

    GitHub repository with 7,679 stars and 2,062 forks.

    Trending score: 0.88; stars gained: +5; forks gained: +1.

    Language: C++

    Topics: amazon-web-services, database, fulltext-search, galera, geographical-information-system, innodb

  6. open-policy-agent/opa

    Open Policy Agent (OPA) is an open source, general-purpose policy engine.

    GitHub repository with 11,821 stars and 1,579 forks.

    Trending score: 0.82; stars gained: +6; forks gained: -1.

    Language: Go

    Topics: authorization, cloud-native, compliance, declarative, json, opa