Mgepahmge/cuDAO

Header-only CUDA runtime library for automatic dependency-aware kernel scheduling based on memory access semantics.

GitHub repository with 6 stars and 0 forks.

Language: Cuda

Topics: cpp, cpp17, cuda, cuda-driver-api, gpu, header-only, parallel-computing, scheduler
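
The description above says kernels are scheduled according to their memory access semantics. As a rough illustration of that general idea only, here is a minimal host-side C++17 sketch that derives ordering constraints from declared read/write accesses. Every name in it (`Access`, `BufferUse`, `Task`, `conflicts`, `infer_dependencies`) is hypothetical and chosen for this example; it is not cuDAO's API and makes no claim about how the library is implemented.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; these are NOT cuDAO's API.
enum class Access { Read, Write };

// One declared use of a buffer by a task.
struct BufferUse {
    const void* buffer;  // identity of the buffer (e.g. a device pointer)
    Access mode;         // how the task accesses it
};

// A unit of work (e.g. a kernel launch) plus the buffers it declares.
struct Task {
    std::vector<BufferUse> uses;
};

// Two tasks conflict when they touch the same buffer and at least one of
// them writes it (read-after-write, write-after-read, write-after-write).
inline bool conflicts(const Task& a, const Task& b) {
    for (const auto& ua : a.uses)
        for (const auto& ub : b.uses)
            if (ua.buffer == ub.buffer &&
                (ua.mode == Access::Write || ub.mode == Access::Write))
                return true;
    return false;
}

// Given tasks in submission order, returns for each task the indices of
// earlier tasks it must wait for. Tasks with no conflict between them are
// left unordered, so a scheduler would be free to overlap them.
inline std::vector<std::vector<std::size_t>>
infer_dependencies(const std::vector<Task>& tasks) {
    std::vector<std::vector<std::size_t>> deps(tasks.size());
    for (std::size_t i = 0; i < tasks.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            if (conflicts(tasks[j], tasks[i]))
                deps[i].push_back(j);
    return deps;
}
```

In this sketch, two tasks that only read the same buffer get no edge between them, while a reader submitted after a writer of that buffer depends on the writer. The pairwise comparison is quadratic in the number of tasks and is kept for clarity; a real scheduler would likely track the last writer and the current readers per buffer instead.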

24h trend summary

Trending score 0.55, freshness score 0.92, stars gained +1, forks gained +0.

Latest metric snapshot

2026-06-15: 6 stars and 0 forks.

Similar repositories

  1. Mgepahmge/cuDAO

    Header-only CUDA runtime library for automatic dependency-aware kernel scheduling based on memory access semantics.

    GitHub repository with 6 stars and 0 forks.

    Trending score: 0.55; stars gained: +1; forks gained: +0.

    Language: Cuda

    Topics: cpp, cpp17, cuda, cuda-driver-api, gpu, header-only

  2. mesutoezdil/Systematic-CUDA-Learning

    Personal CUDA learning repo, built step by step from scratch.

    GitHub repository with 39 stars and 5 forks.

    Trending score: 0.41; stars gained: +1; forks gained: +0.

    Language: Cuda

    Topics: cpp, cuda, gpu, gpu-computing, gpu-optimization, gpu-programming

  3. kekzl/imp

    From-scratch C++/CUDA LLM inference engine for the NVIDIA RTX 5090 (sm_120a). The fastest single-user inference on the 5090: faster decode than llama.cpp, at-or-ahead of vLLM on NVFP4, and the only engine running native NVFP4 on consumer Blackwell. 100% written by Claude Code.

    GitHub repository with 19 stars and 2 forks.

    Trending score: 0.24; stars gained: +0; forks gained: +0.

    Language: Cuda

    Topics: blackwell, cpp, cuda, fp4, gated-deltanet, gguf

Trending in Cuda

  1. alibaba/rtp-llm

    RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

    GitHub repository with 1,223 stars and 214 forks.

    Trending score: 2.00; stars gained: +6; forks gained: +4.

    Language: Cuda

    Topics: gpt, inference, llama, llm, llm-serving, llmops

  2. mirage-project/mirage

    Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

    GitHub repository with 2,313 stars and 220 forks.

    Trending score: 1.87; stars gained: +6; forks gained: +1.

    Language: Cuda

  3. yassa9/dvlt.cu

    Suckless no dependencies CUDA/C++ port of NVIDIA's DVLT, feed it images, get a 3D point cloud + camera poses. NO python. One fast 5MB binary.

    GitHub repository with 53 stars and 8 forks.

    Trending score: 0.93; stars gained: +1; forks gained: +0.

    Language: Cuda

    Topics: 3d, 3d-models, 3d-reconstruction, cuda, cuda-cpp, cuda-kernels

  4. brucefan1983/GPUMD

    Graphics Processing Units Molecular Dynamics

    GitHub repository with 786 stars and 187 forks.

    Trending score: 0.83; stars gained: +1; forks gained: -1.

    Language: Cuda

    Topics: molecular-dynamics-simulation, heat-transport, cuda, molecular-dynamics, gpumd, phonon

  5. hemantsingh443/GPT2-inference

    GPT inference in pure CUDA and C++

    GitHub repository with 17 stars and 3 forks.

    Trending score: 0.63; stars gained: +0; forks gained: +0.

    Language: Cuda

  6. Mgepahmge/cuDAO

    Header-only CUDA runtime library for automatic dependency-aware kernel scheduling based on memory access semantics.

    GitHub repository with 6 stars and 0 forks.

    Trending score: 0.55; stars gained: +1; forks gained: +0.

    Language: Cuda

    Topics: cpp, cpp17, cuda, cuda-driver-api, gpu, header-only

Trending topic: cpp

  1. k2-fsa/sherpa-onnx

    Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

    GitHub repository with 12,989 stars and 1,486 forks.

    Trending score: 3.26; stars gained: +47; forks gained: +6.

    Language: C++

    Topics: aarch64, android, arm32, asr, cpp, csharp

  2. rocketride-org/rocketride-server

    High-performance AI pipeline engine with a C++ core and 50+ Python-extensible nodes. Build, debug, and scale LLM workflows with 13+ model providers, 8+ vector databases, and agent orchestration, all from your IDE. Includes VS Code extension, TypeScript/Python SDKs, and Docker deployment.

    GitHub repository with 3,869 stars and 1,239 forks.

    Trending score: 3.03; stars gained: +60; forks gained: +14.

    Language: Python

    Topics: ai, cpp, data-pipeline, data-processing, machine-learning, mcp

  3. M2Team/NanaZip

    The 7-Zip derivative intended for the modern Windows experience

    GitHub repository with 14,572 stars and 368 forks.

    Trending score: 2.89; stars gained: +55; forks gained: +1.

    Language: C++

    Topics: cpp, file-compression, file-manager, windows-10, windows-11, windows-desktop

  4. ClickHouse/ClickHouse

    ClickHouse® is a real-time analytics database management system

    GitHub repository with 48,008 stars and 8,511 forks.

    Trending score: 2.67; stars gained: +11; forks gained: +4.

    Language: C++

    Topics: ai, analytics, big-data, clickhouse, cloud-native, cpp

  5. kritishmohapatra/100_Days_100_IoT_Projects

    A 100-day challenge exploring IoT and embedded systems using ESP32, ESP8266, and Raspberry Pi Pico with MicroPython. Each day covers a new sensor or module with complete code, circuit diagram, and explanation.

    GitHub repository with 784 stars and 97 forks.

    Trending score: 2.64; stars gained: +19; forks gained: +6.

    Language: Python

    Topics: 100daysofcode, cpp, esp32, esp8266, iot, iot-application

  6. slint-ui/slint

    Slint is an open-source declarative GUI toolkit to build native user interfaces for Rust, C++, JavaScript, or Python apps.

    GitHub repository with 22,897 stars and 906 forks.

    Trending score: 2.57; stars gained: +15; forks gained: +1.

    Language: Rust

    Topics: cpp, declarative-ui, desktop, embedded-devices, gui, javascript