opendata-lab/opendataworks

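opendataworks is a unified data portal for big data platforms. Built on open-source projects such as DolphinScheduler and Doris, it aims to give enterprises a one-stop solution for data asset management, task scheduling and orchestration, and lineage tracking.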

GitHub repository with 39 stars and 13 forks.

Language: Java

Topics: data-engineering, datax, dolphinscheduler, doris

Latest metric snapshot

2026-06-05: 39 stars and 13 forks.

Similar repositories

  1. DataSQRL/sqrl

    Agentic Data Engineering Harness for building data pipelines, data products, data APIs, and data lakes autonomously

    GitHub repository with 213 stars and 25 forks.

    Trending score: 0.09; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: streaming, data-pipeline, api, event-driven, data-engineering, harness

  2. opendatadiscovery/odd-platform

    First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

    GitHub repository with 1,408 stars and 139 forks.

    Trending score: 0.04; stars gained: +0; forks gained: -1.

    Language: Java

    Topics: alerting, bigdata, data-catalog, data-discovery, data-engineering, data-exploration

Trending in Java

  1. fish2018/webhtv

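    WebHomeTV is a fork of FongMi that adds a customizable WebHome home page, an App Native SDK, cloud-drive link detection, and a Nostr recommendation home page. The project's core goal is to turn a CSP site's home page into a genuinely developable web application: developers can customize the home page with HTML/CSS/JavaScript, then use the native capabilities exposed by the App to handle search, playback, cross-origin requests, resource proxying, recently watched items, cloud-drive detection, and state synchronization.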

    GitHub repository with 385 stars and 108 forks.

    Trending score: 3.29; stars gained: +83; forks gained: +16.

    Language: Java

  2. juanjuandog/FinSight-AI

    AI equity research agent with resilient workflows, Redis Lua single-flight, pgvector RAG, versioned reports, evidence tracing, and RAG evaluation.

    GitHub repository with 1,003 stars and 58 forks.

    Trending score: 3.24; stars gained: +77; forks gained: +1.

    Language: Java

    Topics: ai-agent, financial-research, llm-evaluation, pgvector, postgresql, rabbitmq

  3. apache/kafka

    Apache Kafka - A distributed event streaming platform

    GitHub repository with 32,714 stars and 15,251 forks.

    Trending score: 2.36; stars gained: +10; forks gained: +7.

    Language: Java

    Topics: scala, kafka, java, streaming

  4. spring-projects/spring-ai

    An Application Framework for AI Engineering

    GitHub repository with 8,869 stars and 2,616 forks.

    Trending score: 2.31; stars gained: +22; forks gained: +5.

    Language: Java

    Topics: artificial-intelligence, hacktoberfest, java, spring-ai

  5. JetBrains/intellij-community

    IntelliJ IDEA & IntelliJ Platform

    GitHub repository with 20,185 stars and 5,924 forks.

    Trending score: 2.29; stars gained: +16; forks gained: +4.

    Language: Java

    Topics: code-editor, ide, intellij, intellij-community, intellij-platform

  6. apache/beam

    Apache Beam is a unified programming model for Batch and Streaming data processing.

    GitHub repository with 8,605 stars and 4,576 forks.

    Trending score: 2.18; stars gained: +5; forks gained: +4.

    Language: Java

    Topics: batch, beam, big-data, golang, java, python

Trending topic: data-engineering

  1. Kaelio/ktx

    ktx is an executable context layer for data and analytics agents 🐙 Allow Claude Code, Codex, and any AI agent to query data accurately through MCP with skills, memory and a semantic layer

    GitHub repository with 895 stars and 46 forks.

    Trending score: 2.76; stars gained: +24; forks gained: +1.

    Language: TypeScript

    Topics: agent, agent-skills, agents, ai-agent, ai-agents, analytics

  2. SouravRoy-ETL/duckle

    Local-first ETL/ELT studio: a drag-and-drop visual pipeline designer that compiles to SQL and runs on DuckDB. Tiny desktop app, no servers, git-friendly workspaces.

    GitHub repository with 284 stars and 20 forks.

    Trending score: 2.64; stars gained: +36; forks gained: +0.

    Language: Rust

    Topics: data-engineering, data-integration, data-pipeline, data-quality, desktop-app, drag-and-drop

  3. weifuwan/seatunnel-web

    SeaTunnel Web is a visual tool for building and watching over your Apache SeaTunnel data pipelines, with drag-and-drop DAGs and simple connector setup.

    GitHub repository with 290 stars and 30 forks.

    Trending score: 1.85; stars gained: +57; forks gained: +5.

    Language: TypeScript

    Topics: batch, dag, data-engineering, data-integration, data-pipeline, etl

  4. risingwavelabs/risingwave

    Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.

    GitHub repository with 9,065 stars and 776 forks.

    Trending score: 1.62; stars gained: +2; forks gained: +0.

    Language: Rust

    Topics: apache-iceberg, data-engineering, database, etl-pipeline, event-streaming, kafka

  5. DataTalksClub/data-engineering-zoomcamp

    Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

    GitHub repository with 41,902 stars and 8,303 forks.

    Trending score: 1.49; stars gained: +23; forks gained: +3.

    Language: Jupyter Notebook

    Topics: course, data-engineering, dbt, docker, free, kafka

  6. dlt-hub/dlt

    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

    GitHub repository with 5,436 stars and 523 forks.

    Trending score: 0.98; stars gained: +8; forks gained: +0.

    Language: Python

    Topics: data, data-engineering, data-lake, data-loading, data-warehouse, elt