NVIDIA/gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
GitHub repository with 2,742 stars and 516 forks.
Language: Go
Topics: cuda, gpu, kubernetes, nvidia
Trending score: 1.83; freshness score: 1.00; stars gained: +7; forks gained: +3.
2026-06-15: 2,742 stars and 516 forks.
Auto-tuned launcher for GGUF models on llama.cpp / ik_llama.cpp — OpenAI-compatible server with multi-GPU tensor-split, MoE expert placement, measured flag tuning (AI Tune), hardware-matched HuggingFace downloads, and crash recovery. An Ollama alternative for multi-GPU rigs.
GitHub repository with 226 stars and 11 forks.
Trending score: 1.23; stars gained: +3; forks gained: +0.
Language: Go
Topics: cuda, gguf, llama-cpp, llm, metal, moe
eBPF based always-on CPU/GPU profiler auto-discovering targets in Kubernetes and systemd, zero code changes or restarts needed!
GitHub repository with 728 stars and 90 forks.
Trending score: 0.40; stars gained: +0; forks gained: +0.
Language: Go
Topics: ebpf, profiling, pprof, performance, kubernetes, observability
Local-first session search, analytics, insights, and token use statistics for coding agents, supporting Claude Code, Codex, and more than 20 other agents.
GitHub repository with 2,650 stars and 233 forks.
Trending score: 4.99; stars gained: +524; forks gained: +37.
Language: Go
Open-source & free — Battle-tested at Alibaba's scale. Hybrid architecture code review tool: deterministic pipelines + LLM Agent, precise line-level comments, built-in fine-tuned ruleset (NPE, thread-safety, XSS, SQL injection), OpenAI & Anthropic compatible.
GitHub repository with 7,216 stars and 421 forks.
Trending score: 4.97; stars gained: +315; forks gained: +24.
Language: Go
Topics: agent, code-review, code-review-assistant, harness, repository-level-context
DeepSeek-native AI coding agent for your terminal. Engineered around prefix-cache stability — leave it running.
GitHub repository with 22,195 stars and 1,334 forks.
Trending score: 4.91; stars gained: +265; forks gained: +16.
Language: Go
Topics: agent, agent-framework, ai-agent, ai-coding, cli, coding-agent
A unified AI model hub for aggregation & distribution. It supports cross-converting various LLMs into OpenAI-compatible, Claude-compatible, or Gemini-compatible formats. A centralized gateway for personal and enterprise model management. 🍥
GitHub repository with 38,908 stars and 8,838 forks.
Trending score: 4.77; stars gained: +261; forks gained: +62.
Language: Go
Topics: claude, gemini, openai, rerank, ai-gateway, deepseek
AI-native, free, open-source alternative to Jira, Trello, ClickUp & Monday. Built for Scrum teams where humans and AI agents collaborate as equals — on the same board, the same sprints, the same goals. Self-hosted. Fully customizable via config and plugins.
GitHub repository with 895 stars and 48 forks.
Trending score: 4.60; stars gained: +309; forks gained: +24.
Language: Go
Topics: ai-agent, bdd, clickup-alternative, jira-alternative, mcp, open-source
Wrap Gemini CLI, Antigravity, ChatGPT Codex, Claude Code, Grok Build as an OpenAI/Gemini/Claude/Codex compatible API service, allowing you to enjoy the free Gemini 3.1 Pro, GPT 5.5, Grok 4.3, Claude model through API
GitHub repository with 37,566 stars and 6,194 forks.
Trending score: 4.51; stars gained: +157; forks gained: +21.
Language: Go
Topics: antigravity, claude-code, cluade, codex, gemini, openai
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
GitHub repository with 9,093 stars and 1,322 forks.
Trending score: 4.67; stars gained: +411; forks gained: +26.
Language: Python
Topics: amd, cuda, fast, inference, kv-cache, llm
A high-throughput and memory-efficient inference and serving engine for LLMs
GitHub repository with 82,938 stars and 18,083 forks.
Trending score: 4.18; stars gained: +80; forks gained: +23.
Language: Python
Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt
SGLang is a high-performance serving framework for large language models and multimodal models.
GitHub repository with 29,043 stars and 6,540 forks.
Trending score: 3.37; stars gained: +33; forks gained: +15.
Language: Python
Topics: attention, blackwell, cuda, deepseek, diffusion, glm
Fast LLM speculative inference server for consumer hardware.
GitHub repository with 2,503 stars and 229 forks.
Trending score: 2.88; stars gained: +27; forks gained: +6.
Language: C++
Topics: cuda, cuda-kernels, dflash, kernel, llama-cpp, local-ai
cuda-oxide is an experimental Rust-to-CUDA compiler that lets you write (SIMT) GPU kernels in safe(ish), idiomatic Rust. It compiles standard Rust code directly to PTX — no DSLs, no foreign language bindings, just Rust.
GitHub repository with 2,756 stars and 182 forks.
Trending score: 2.83; stars gained: +25; forks gained: +4.
Language: Rust
Topics: async, compiler-backend, cuda, gpu, heterogeneous-computing, high-performance-computing
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
GitHub repository with 13,881 stars and 2,465 forks.
Trending score: 2.24; stars gained: +7; forks gained: +2.
Language: Python
Topics: blackwell, cuda, llm-serving, moe, pytorch