Mgepahmge/cuDAO

Header-only CUDA runtime library for automatic dependency-aware kernel scheduling based on memory access semantics.

GitHub repository with 6 stars and 0 forks.

Language: Cuda

Topics: cpp, cpp17, cuda, cuda-driver-api, gpu, header-only, parallel-computing, scheduler

24h trend summary

Trending score 0.55, freshness score 0.92, stars gained +1, forks gained +0.

Latest metric snapshot

2026-06-15: 6 stars and 0 forks.

Similar repositories

1. Mgepahmge/cuDAO

Header-only CUDA runtime library for automatic dependency-aware kernel scheduling based on memory access semantics.

GitHub repository with 6 stars and 0 forks.

Trending score: 0.55; stars gained: +1; forks gained: +0.

Language: Cuda

Topics: cpp, cpp17, cuda, cuda-driver-api, gpu, header-only

2. mesutoezdil/Systematic-CUDA-Learning

Personal CUDA learning repo, built step by step from scratch.

GitHub repository with 39 stars and 5 forks.

Trending score: 0.41; stars gained: +1; forks gained: +0.

Language: Cuda

Topics: cpp, cuda, gpu, gpu-computing, gpu-optimization, gpu-programming

3. kekzl/imp

From-scratch C++/CUDA LLM inference engine for the NVIDIA RTX 5090 (sm_120a). The fastest single-user inference on the 5090: faster decode than llama.cpp, at-or-ahead of vLLM on NVFP4, and the only engine running native NVFP4 on consumer Blackwell. 100% written by Claude Code.

GitHub repository with 19 stars and 2 forks.

Trending score: 0.24; stars gained: +0; forks gained: +0.

Language: Cuda

Topics: blackwell, cpp, cuda, fp4, gated-deltanet, gguf

Trending in Cuda

1. alibaba/rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

GitHub repository with 1,223 stars and 214 forks.

Trending score: 2.00; stars gained: +6; forks gained: +4.

Language: Cuda

Topics: gpt, inference, llama, llm, llm-serving, llmops

2. mirage-project/mirage

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

GitHub repository with 2,313 stars and 220 forks.

Trending score: 1.87; stars gained: +6; forks gained: +1.

Language: Cuda

3. yassa9/dvlt.cu

Suckless no dependencies CUDA/C++ port of NVIDIA's DVLT, feed it images, get a 3D point cloud + camera poses. NO python. One fast 5MB binary.

GitHub repository with 53 stars and 8 forks.

Trending score: 0.93; stars gained: +1; forks gained: +0.

Language: Cuda

Topics: 3d, 3d-models, 3d-reconstruction, cuda, cuda-cpp, cuda-kernels

4. brucefan1983/GPUMD

Graphics Processing Units Molecular Dynamics

GitHub repository with 786 stars and 187 forks.

Trending score: 0.83; stars gained: +1; forks gained: -1.

Language: Cuda

Topics: molecular-dynamics-simulation, heat-transport, cuda, molecular-dynamics, gpumd, phonon

5. hemantsingh443/GPT2-inference

GPT inference in pure CUDA and C++

GitHub repository with 17 stars and 3 forks.

Trending score: 0.63; stars gained: +0; forks gained: +0.

Language: Cuda

6. Mgepahmge/cuDAO

Header-only CUDA runtime library for automatic dependency-aware kernel scheduling based on memory access semantics.

GitHub repository with 6 stars and 0 forks.

Trending score: 0.55; stars gained: +1; forks gained: +0.

Language: Cuda

Topics: cpp, cpp17, cuda, cuda-driver-api, gpu, header-only

Trending topic: cpp

1. k2-fsa/sherpa-onnx

2. rocketride-org/rocketride-server

3. M2Team/NanaZip

4. ClickHouse/ClickHouse

5. kritishmohapatra/100_Days_100_IoT_Projects

6. slint-ui/slint