kreuzberg-dev/kreuzberg
A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via the CLI, REST API, or MCP server.
GitHub repository with 8,441 stars and 497 forks.
Language: Rust
Topics: bun, csharp, document-intelligence, elixir, ffi, golang, java, metadata-extraction, node, pdf-extraction