edwinweber/dbt_duckdb_demo_public

Open-source data engineering demo project using dbt, DuckDB, dlt, Dagster, and Metabase. Two storage modes for the Delta tables are supported: local and Microsoft Fabric OneLake.

GitHub repository with 14 stars and 0 forks.

Language: Python

Topics: data-engineering, dbt, delta-lake, dlt, duckdb, medallion-architecture, microsoft-fabric, motherduck, open-data, python

Latest metric snapshot

2026-06-05: 14 stars and 0 forks.

Similar repositories

  1. apache/airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    GitHub repository with 45,795 stars and 17,231 forks.

    Trending score: 2.61; stars gained: +22; forks gained: +11.

    Language: Python

    Topics: airflow, apache, apache-airflow, automation, dag, data-engineering

  2. airbytehq/airbyte

    Open-source data movement for ELT pipelines and AI agents - from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

    GitHub repository with 21,447 stars and 5,218 forks.

    Trending score: 2.22; stars gained: +15; forks gained: +3.

    Language: Python

    Topics: bigquery, change-data-capture, data, data-analysis, data-collection, data-engineering

  3. PrefectHQ/prefect

    Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

    GitHub repository with 22,598 stars and 2,333 forks.

    Trending score: 1.97; stars gained: +14; forks gained: +1.

    Language: Python

    Topics: python, workflow, data-engineering, data-science, workflow-engine, prefect

  4. feast-dev/feast

    The Open Source Feature Store for AI/ML

    GitHub repository with 7,092 stars and 1,343 forks.

    Trending score: 1.88; stars gained: +3; forks gained: +0.

    Language: Python

    Topics: big-data, data-engineering, data-quality, data-science, feature-store, features

  5. benseverndev-oss/goldenmatch

    Zero-config entity resolution. The zero-tuning Fellegi-Sunter path beats hand-rolled Splink head-to-head; scales from a CSV to a verified 100M-row dedupe in 9.2 min on Ray. Fuzzy/exact/probabilistic + PPRL + LLM, identity graph. Python + edge-safe TypeScript (optional WASM), SQL-native in Postgres & DuckDB, MCP/REST + dbt/Airflow.

    GitHub repository with 108 stars and 10 forks.

    Trending score: 1.22; stars gained: +3; forks gained: +0.

    Language: Python

    Topics: active-learning, agent, airflow, auto-config, data-engineering, data-quality

  6. mlrun/mlrun

    MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.

    GitHub repository with 1,672 stars and 307 forks.

    Trending score: 0.99; stars gained: +1; forks gained: +0.

    Language: Python

    Topics: mlops, python, data-science, machine-learning, data-engineering, experiment-tracking

Trending in Python

  1. mvanhorn/last30days-skill

    AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary

    GitHub repository with 40,614 stars and 3,271 forks.

    Trending score: 5.82; stars gained: +1,312; forks gained: +87.

    Language: Python

  2. chopratejas/headroom

    Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

    GitHub repository with 25,425 stars and 1,676 forks.

    Trending score: 5.73; stars gained: +2,844; forks gained: +202.

    Language: Python

    Topics: agent, ai, anthropic, compression, context-engineering, context-window

  3. pewdiepie-archdaemon/odysseus

    Self-hosted AI workspace.

    GitHub repository with 69,675 stars and 8,820 forks.

    Trending score: 5.70; stars gained: +951; forks gained: +165.

    Language: Python

  4. NousResearch/hermes-agent

    The agent that grows with you

    GitHub repository with 192,327 stars and 33,531 forks.

    Trending score: 5.48; stars gained: +990; forks gained: +282.

    Language: Python

    Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

  5. safishamsi/graphify

    AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a queryable knowledge graph. App code + database schema + infrastructure in one graph.

    GitHub repository with 66,467 stars and 6,719 forks.

    Trending score: 5.25; stars gained: +1,314; forks gained: +109.

    Language: Python

    Topics: antigravity, claude-code, codex, gemini, graphrag, knowledge-graph

  6. hugohe3/ppt-master

    AI generates a real, editable PowerPoint from any document - native shapes & animations, speaker notes voiced as audio narration, and the option to follow your own .pptx template, not slide images · by Hugo He

    GitHub repository with 27,112 stars and 2,418 forks.

    Trending score: 5.10; stars gained: +903; forks gained: +61.

    Language: Python

    Topics: ai-agent, aippt, office, powerpoint, powerpoint-generation, ppt

Trending topic: data-engineering

  1. Kaelio/ktx

    ktx is an executable context layer for data and analytics agents. Allow Claude Code, Codex, or other AI agents to query data accurately and with full context of your company

    GitHub repository with 1,179 stars and 62 forks.

    Trending score: 3.28; stars gained: +91; forks gained: +4.

    Language: TypeScript

    Topics: agent, agent-skills, agents, ai-agent, ai-agents, analytics

  2. apache/superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    GitHub repository with 73,270 stars and 17,602 forks.

    Trending score: 2.74; stars gained: +28; forks gained: +19.

    Language: TypeScript

    Topics: analytics, apache, apache-superset, asf, bi, business-analytics

  3. lakehq/sail

    Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.

    GitHub repository with 2,939 stars and 172 forks.

    Trending score: 2.69; stars gained: +25; forks gained: +2.

    Language: Rust

    Topics: apache-iceberg, apache-spark, arrow, artificial-intelligence, big-data, data-engineering

  4. apache/airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    GitHub repository with 45,795 stars and 17,231 forks.

    Trending score: 2.61; stars gained: +22; forks gained: +11.

    Language: Python

    Topics: airflow, apache, apache-airflow, automation, dag, data-engineering

  5. SouravRoy-ETL/duckle

    Local-first ETL/ELT studio: a drag-and-drop visual pipeline designer that compiles to SQL and runs on DuckDB. Tiny desktop app, no servers, git-friendly workspaces.

    GitHub repository with 437 stars and 28 forks.

    Trending score: 2.61; stars gained: +27; forks gained: +0.

    Language: Rust

    Topics: data-engineering, data-integration, data-pipeline, data-quality, desktop-app, drag-and-drop

  6. kestra-io/kestra

    Event Driven Orchestration & Scheduling Platform for Mission Critical Applications

    GitHub repository with 27,047 stars and 2,617 forks.

    Trending score: 2.56; stars gained: +38; forks gained: +2.

    Language: Java

    Topics: ai-agents, automation, control-plane, data-engineering, data-orchestration, data-orchestrator