supranational/sppark
Zero-knowledge template library
GitHub repository with 219 stars and 97 forks.
Language: Cuda
Topics: cuda, bls12-377, bls12-381, pasta-curves, zero-knowledge, zero-knowledge-proofs, zk-snarks, zk-starks, ntt, rocm
Trending score 0.18, activity score 0.93, stars gained +0, forks gained +1.
2026-06-05: 219 stars and 97 forks.
Minimal FlashAttention in CUDA C++/CuTe: readable WMMA/CuTe kernels, no NxN workspace, up to 4.5x faster than naive PyTorch
GitHub repository with 21 stars and 1 fork.
Trending score: 1.02; stars gained: +9; forks gained: +1.
Language: Cuda
Topics: attention, cuda, cute, cutlass, flash-attention, flashattention
CUDA Library Samples
GitHub repository with 2,424 stars and 459 forks.
Trending score: 0.79; stars gained: +5; forks gained: +1.
Language: Cuda
Topics: cufft, curand, cusolver, cusparse, nvjpeg, cudss
Graphics Processing Units Molecular Dynamics
GitHub repository with 782 stars and 186 forks.
Trending score: 0.69; stars gained: +4; forks gained: +2.
Language: Cuda
Topics: molecular-dynamics-simulation, heat-transport, cuda, molecular-dynamics, gpumd, phonon
Zero-knowledge template library
GitHub repository with 219 stars and 97 forks.
Trending score: 0.18; stars gained: +0; forks gained: +1.
Language: Cuda
Topics: cuda, bls12-377, bls12-381, pasta-curves, zero-knowledge, zero-knowledge-proofs
A high-performance op kernel library running on the metax HD platform
GitHub repository with 6 stars and 3 forks.
Trending score: 0.05; stars gained: +0; forks gained: +0.
Language: Cuda
"brainflayer" CUDA & private key recovery tool
GitHub repository with 6 stars and 4 forks.
Trending score: 0.05; stars gained: +0; forks gained: +0.
Language: Cuda
A high-throughput and memory-efficient inference and serving engine for LLMs
GitHub repository with 81,996 stars and 17,679 forks.
Trending score: 3.75; stars gained: +79; forks gained: +46.
Language: Python
Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt
A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.
GitHub repository with 5,107 stars and 542 forks.
Trending score: 2.51; stars gained: +11; forks gained: +1.
Language: Python
Topics: ascend, cuda, deepseek, distributed-inference, genai, high-performance-inference
Fast LLM speculative inference server for consumer hardware.
GitHub repository with 2,332 stars and 217 forks.
Trending score: 2.31; stars gained: +17; forks gained: +3.
Language: C++
Topics: kernel, llama-cpp, local-ai, nvidia-cuda, qwen, rtx3090
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
GitHub repository with 8,422 stars and 1,246 forks.
Trending score: 2.17; stars gained: +11; forks gained: +6.
Language: Python
Topics: amd, cuda, fast, inference, kv-cache, llm
Making it easier to work with shaders
GitHub repository with 5,349 stars and 451 forks.
Trending score: 2.08; stars gained: +4; forks gained: +2.
Language: C++
Topics: shaders, hlsl, glsl, d3d12, vulkan, cuda
TT-NN operator library and TT-Metalium low-level kernel programming model.
GitHub repository with 1,494 stars and 480 forks.
Trending score: 1.82; stars gained: +7; forks gained: +5.
Language: C++
Topics: accelerator, ai, cuda, deepseek, gpu, img-gen