linkedin/openhouse

Open Control Plane for Tables in Data Lakehouse

GitHub repository with 389 stars and 78 forks.

Language: Java

Topics: big-data, catalog, datalake, datalakehouse, declarative, iceberg, management, tables


Latest metric snapshot

2026-06-10: 389 stars and 78 forks.

Similar repositories

  1. trinodb/trino

    Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

    GitHub repository with 12,931 stars and 3,661 forks.

    Trending score: 2.03; stars gained: +5; forks gained: -1.

    Language: Java

    Topics: analytics, big-data, data-science, database, databases, datalake

  2. StarRocks/starrocks

    The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.

    GitHub repository with 11,786 stars and 2,449 forks.

    Trending score: 1.93; stars gained: +5; forks gained: +2.

    Language: Java

    Topics: analytics, big-data, cloudnative, database, datalake, delta-lake

  3. apache/hive

    Apache Hive

    GitHub repository with 5,978 stars and 4,792 forks.

    Trending score: 1.05; stars gained: +1; forks gained: -1.

    Language: Java

    Topics: apache, big-data, database, hadoop, hive, java

  4. apache/ignite

    Apache Ignite

    GitHub repository with 5,067 stars and 1,938 forks.

    Trending score: 0.91; stars gained: +2; forks gained: -1.

    Language: Java

    Topics: big-data, cache, cloud, data-management-platform, database, distributed-sql-database

  5. apache/paimon

    Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

    GitHub repository with 3,299 stars and 1,336 forks.

    Trending score: 0.81; stars gained: +0; forks gained: +2.

    Language: Java

    Topics: big-data, data-ingestion, flink, paimon, real-time-analytics, spark

  6. apache/fluss

    Apache Fluss is a streaming storage built for real-time analytics.

    GitHub repository with 1,940 stars and 561 forks.

    Trending score: 0.78; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: streaming, fluss, lakehouse, real-time-analytics, big-data, hacktoberfest

Trending in Java

  1. opendataloader-project/opendataloader-pdf

    PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

    GitHub repository with 25,066 stars and 2,364 forks.

    Trending score: 4.94; stars gained: +514; forks gained: +54.

    Language: Java

    Topics: a11y, accessibility, ai, bounding-box, document-parsing, eaa

  2. NationalSecurityAgency/ghidra

    Ghidra is a software reverse engineering (SRE) framework

    GitHub repository with 69,674 stars and 7,648 forks.

    Trending score: 3.84; stars gained: +105; forks gained: +11.

    Language: Java

    Topics: disassembler, reverse-engineering, software-analysis

  3. agentscope-ai/agentscope-java

    AgentScope Java: Agent-Oriented Programming for Building LLM Applications

    GitHub repository with 3,837 stars and 819 forks.

    Trending score: 3.82; stars gained: +104; forks gained: +22.

    Language: Java

    Topics: agent, agentic, agentic-ai, ai, llm

  4. alibaba/spring-ai-alibaba

    Agentic AI Framework for Java Developers

    GitHub repository with 10,020 stars and 2,232 forks.

    Trending score: 3.45; stars gained: +80; forks gained: +23.

    Language: Java

    Topics: artificial-intelligence, java, spring-ai, agentic, context-engineering, multi-agent

  5. bethington/ghidra-mcp

    Ghidra MCP Server — 200+ MCP tools for AI-powered reverse engineering. GUI plugin + headless server, lazy tool loading, convention enforcement, batch operations, Ghidra Server integration, and Docker deployment.

    GitHub repository with 2,440 stars and 32 forks.

    Trending score: 3.42; stars gained: +86; forks gained: +6.

    Language: Java

    Topics: binary-analysis, ghidra, java, mcp, model-context-protocol, python

  6. halo-dev/halo

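    Halo is a powerful, easy-to-use open-source website-building tool. From personal blogs and knowledge bases to corporate websites and online stores, Halo helps you build them all with ease, meeting your diverse site-building needs in one place.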

    GitHub repository with 39,039 stars and 10,297 forks.

    Trending score: 3.32; stars gained: +60; forks gained: +9.

    Language: Java

    Topics: halo, cms, halocms, content-management-system, blog, blog-engine

Trending topic: big-data

  1. lakehq/sail

    Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.

    GitHub repository with 2,964 stars and 173 forks.

    Trending score: 2.70; stars gained: +21; forks gained: +0.

    Language: Rust

    Topics: apache-iceberg, apache-spark, arrow, artificial-intelligence, big-data, data-engineering

  2. ClickHouse/ClickHouse

    ClickHouse® is a real-time analytics database management system

    GitHub repository with 48,009 stars and 8,511 forks.

    Trending score: 2.67; stars gained: +11; forks gained: +4.

    Language: C++

    Topics: ai, analytics, big-data, clickhouse, cloud-native, cpp

  3. apache/spark

    Apache Spark - A unified analytics engine for large-scale data processing

    GitHub repository with 43,456 stars and 29,225 forks.

    Trending score: 2.10; stars gained: +9; forks gained: +4.

    Language: Scala

    Topics: big-data, java, jdbc, python, r, scala

  4. trinodb/trino

    Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

    GitHub repository with 12,931 stars and 3,661 forks.

    Trending score: 2.03; stars gained: +5; forks gained: -1.

    Language: Java

    Topics: analytics, big-data, data-science, database, databases, datalake

  5. StarRocks/starrocks

    The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.

    GitHub repository with 11,786 stars and 2,449 forks.

    Trending score: 1.93; stars gained: +5; forks gained: +2.

    Language: Java

    Topics: analytics, big-data, cloudnative, database, datalake, delta-lake

  6. feast-dev/feast

    The Open Source Feature Store for AI/ML

    GitHub repository with 7,092 stars and 1,344 forks.

    Trending score: 1.70; stars gained: +3; forks gained: +3.

    Language: Python

    Topics: machine-learning, features, ml, big-data, feature-store, python