Scottcjn/exo-cuda

Exo distributed inference with NVIDIA CUDA support via tinygrad

GitHub repository with 80 stars and 11 forks.

Language: Python

Topics: cuda, distributed, exo, inference, llm, tinygrad

Latest metric snapshot

2026-06-05: 80 stars and 11 forks.

Similar repositories

1. vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

GitHub repository with 82,001 stars and 17,690 forks.

Trending score: 3.75; stars gained: +79; forks gained: +46.

Language: Python

Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt

2. sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

GitHub repository with 28,862 stars and 6,348 forks.

Trending score: 1.72; stars gained: -55; forks gained: +18.

Language: Python

Topics: attention, blackwell, cuda, deepseek, diffusion, glm

3. flashinfer-ai/flashinfer

FlashInfer: Kernel Library for LLM Serving

GitHub repository with 5,752 stars and 1,026 forks.

Trending score: 1.16; stars gained: +15; forks gained: +8.

Language: Python

Topics: attention, cuda, distributed-inference, gpu, jit, large-large-models

4. PASSIONLab/OpenEquivariance

OpenEquivariance: a fast, open-source GPU JIT kernel generator for the Clebsch-Gordan Tensor Product.

GitHub repository with 149 stars and 9 forks.

Trending score: 0.45; stars gained: +1; forks gained: +0.

Language: Python

Topics: cuda, geometric-deep-learning, graph-neural-networks, sparse-tensors, equivariance, hip

5. hwdsl2/docker-kokoro

Docker image to run a self-hosted Kokoro TTS server with an OpenAI-compatible audio speech API. 50+ voices across 9 languages, streaming support, all major audio formats, NVIDIA GPU (CUDA) acceleration, offline mode, and persistent model cache. Multi-arch: amd64, arm64.

GitHub repository with 16 stars and 2 forks.

Trending score: 0.36; stars gained: +1; forks gained: +0.

Language: Python

Topics: openai, self-hosted, speech, text-to-speech, tts, speech-synthesis

6. notwitcheer/llm-bench-rig

Dual-engine (llama.cpp + vLLM) LLM benchmarking pipeline for GGUF & safetensors on NVIDIA GPUs — speed, quality, live dashboard, publishable cards.

GitHub repository with 9 stars and 2 forks.

Trending score: 0.33; stars gained: +1; forks gained: +0.

Language: Python

Topics: benchmarking, cuda, fastapi, gguf, llama-cpp, llm

Trending in Python

1. NousResearch/hermes-agent

The agent that grows with you

GitHub repository with 182,353 stars and 31,271 forks.

Trending score: 5.95; stars gained: +1,867; forks gained: +361.

Language: Python

Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

2. chopratejas/headroom

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

GitHub repository with 14,053 stars and 885 forks.

Trending score: 5.69; stars gained: +2,829; forks gained: +175.

Language: Python

Topics: agent, ai, anthropic, compression, context-engineering, context-window

3. Imbad0202/academic-research-skills

Academic Research Skills for Claude Code: research → write → review → revise → finalize

GitHub repository with 27,548 stars and 2,267 forks.

Trending score: 5.52; stars gained: +1,079; forks gained: +89.

Language: Python

Topics: academic-pipeline, academic-writing, ai-research, claude, claude-code, literature-review

4. rohitg00/ai-engineering-from-scratch

Learn it. Build it. Ship it for others.

GitHub repository with 28,711 stars and 4,695 forks.

Trending score: 5.32; stars gained: +1,261; forks gained: +238.

Language: Python

Topics: agents, ai, ai-agents, ai-engineering, computer-vision, course

5. vinta/awesome-python

An opinionated list of Python frameworks, libraries, tools, and resources

GitHub repository with 301,427 stars and 28,046 forks.

Trending score: 4.60; stars gained: +518; forks gained: +24.

Language: Python

Topics: awesome, python, collections, python-frameworks, python-libraries, python-tools

6. Alishahryar1/free-claude-code

Use claude-code for free in the terminal, the VSCode extension, or Discord, like OpenClaw (voice supported)

GitHub repository with 32,539 stars and 4,943 forks.

Trending score: 4.56; stars gained: +467; forks gained: +82.

Language: Python

Trending topic: cuda

1. vllm-project/vllm

2. tenstorrent/tt-metal

3. AmmarkoV/SAM3DBody-cpp

4. sgl-project/sglang

5. c0deJedi/nbd-vram

6. flashinfer-ai/flashinfer