GovHub-br/data-application-gov-hub

Gov-Hub Data Pipeline

GitHub repository with 21 stars and 30 forks.

Language: Python

Topics: data-engineering, datascience, gov, government, government-data

Latest metric snapshot

2026-06-15: 21 stars and 30 forks.

Similar repositories

  1. apache/airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    GitHub repository with 45,814 stars and 17,240 forks.

    Trending score: 2.84; stars gained: +16; forks gained: +8.

    Language: Python

    Topics: airflow, apache, apache-airflow, automation, dag, data-engineering

  2. airbytehq/airbyte

    Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

    GitHub repository with 21,457 stars and 5,220 forks.

    Trending score: 2.29; stars gained: +9; forks gained: +0.

    Language: Python

    Topics: data, pipeline, data-analysis, data-engineering, java, python

  3. Zipstack/unstract

    LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows

    GitHub repository with 6,656 stars and 631 forks.

    Trending score: 1.90; stars gained: +8; forks gained: +1.

    Language: Python

    Topics: ai-agents, data-engineering, document-ai, generative-ai, idp, json-extraction

  4. benseverndev-oss/goldenmatch

    Zero-config entity resolution. The zero-tuning Fellegi-Sunter path beats hand-rolled Splink head-to-head; scales from a CSV to a verified 100M-row dedupe in 9.2 min on Ray. Fuzzy/exact/probabilistic + PPRL + LLM, identity graph. Python + edge-safe TypeScript (optional WASM), SQL-native in Postgres & DuckDB, MCP/REST + dbt/Airflow.

    GitHub repository with 110 stars and 10 forks.

    Trending score: 1.07; stars gained: +1; forks gained: +0.

    Language: Python

    Topics: active-learning, agent, airflow, auto-config, data-engineering, data-quality

  5. sophie-nguyenthuthuy/data-engineering

    100+ data engineering projects from scratch — streaming, CDC, table formats, query engines, consensus, governance. 2,500+ tests, mypy strict.

    GitHub repository with 46 stars and 22 forks.

    Trending score: 0.71; stars gained: +3; forks gained: +0.

    Language: Python

    Topics: cdc, data-engineering, delta-lake, iceberg, kafka, lsm-tree

  6. debabsah/analytics-office

    A discipline harness for AI-assisted analytics: agent skills for every moment a number gets built, broken, or trusted — requirements, definitions, audits, triage, migrations, dashboards, briefs — every claim carrying its provenance in one living knowledge base.

    GitHub repository with 8 stars and 0 forks.

    Trending score: 0.58; stars gained: +2; forks gained: +0.

    Language: Python

    Topics: ai-agents, analytics, business-intelligence, claude-code, claude-code-plugin, data-engineering

Trending in Python

  1. chopratejas/headroom

    Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

    GitHub repository with 27,902 stars and 1,891 forks.

    Trending score: 6.49; stars gained: +2,776; forks gained: +250.

    Language: Python

    Topics: agent, ai, anthropic, claude-code, compression, context-engineering

  2. harry0703/MoneyPrinterTurbo

    Generate high-definition short videos with one click using large AI models.

    GitHub repository with 87,926 stars and 12,612 forks.

    Trending score: 6.02; stars gained: +1,097; forks gained: +218.

    Language: Python

    Topics: ai, automation, chatgpt, moviepy, python, shortvideo

  3. pewdiepie-archdaemon/odysseus

    Self-hosted AI workspace.

    GitHub repository with 71,291 stars and 9,086 forks.

    Trending score: 5.98; stars gained: +834; forks gained: +140.

    Language: Python

  4. NousResearch/hermes-agent

    The agent that grows with you

    GitHub repository with 193,883 stars and 33,934 forks.

    Trending score: 5.92; stars gained: +753; forks gained: +209.

    Language: Python

    Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

  5. NVIDIA/SkillSpector

    Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks.

    GitHub repository with 5,654 stars and 427 forks.

    Trending score: 5.61; stars gained: +874; forks gained: +76.

    Language: Python

  6. rohitg00/ai-engineering-from-scratch

    Learn it. Build it. Ship it for others.

    GitHub repository with 32,527 stars and 5,342 forks.

    Trending score: 5.59; stars gained: +762; forks gained: +135.

    Language: Python

    Topics: agents, ai, ai-agents, ai-engineering, computer-vision, course

Trending topic: data-engineering

  1. Kaelio/ktx

    ktx is an executable context layer for data and analytics agents 🐙 Allow Claude Code, Codex, or other AI agents to query data accurately and with full context of your company

    GitHub repository with 1,204 stars and 64 forks.

    Trending score: 3.00; stars gained: +21; forks gained: +2.

    Language: TypeScript

    Topics: agent, agent-skills, agents, ai-agent, ai-agents, analytics

  2. weifuwan/seatunnel-web

    Modern SeaTunnel Web UI with visual DAG pipelines, batch & streaming sync, connector management, built-in metrics, and runtime logs.

    GitHub repository with 526 stars and 51 forks.

    Trending score: 2.92; stars gained: +22; forks gained: +3.

    Language: TypeScript

    Topics: batch, dag, data-engineering, data-integration, data-pipeline, etl

  3. apache/airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    GitHub repository with 45,814 stars and 17,240 forks.

    Trending score: 2.84; stars gained: +16; forks gained: +8.

    Language: Python

    Topics: airflow, apache, apache-airflow, automation, dag, data-engineering

  4. apache/superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    GitHub repository with 73,296 stars and 17,613 forks.

    Trending score: 2.83; stars gained: +16; forks gained: +7.

    Language: TypeScript

    Topics: analytics, apache, apache-superset, asf, bi, business-analytics

  5. lakehq/sail

    Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.

    GitHub repository with 2,955 stars and 172 forks.

    Trending score: 2.70; stars gained: +21; forks gained: +0.

    Language: Rust

    Topics: apache-iceberg, apache-spark, arrow, artificial-intelligence, big-data, data-engineering

  6. kestra-io/kestra

    Event Driven Orchestration & Scheduling Platform for Mission Critical Applications

    GitHub repository with 27,058 stars and 2,621 forks.

    Trending score: 2.43; stars gained: +12; forks gained: +6.

    Language: Java

    Topics: ai-agents, automation, control-plane, data-engineering, data-orchestration, data-orchestrator