Yasuaki-Ito/GANSU

GPU Accelerated Numerical Simulation Utility for Quantum Chemistry

GitHub repository with 22 stars and 3 forks.

Language: Cuda

Topics: cuda, gpgpu, hartree-fock, post-hartree-fock, quantum-chemistry

Latest metric snapshot

2026-06-05: 22 stars and 3 forks.

Similar repositories

1. lavawolfiee/mini-flash-attention

Minimal FlashAttention in CUDA C++/CuTe: readable WMMA/CuTe kernels, no NxN workspace, up to 4.5x faster than naive PyTorch

GitHub repository with 21 stars and 1 fork.

Trending score: 1.02; stars gained: +9; forks gained: +1.

Language: Cuda

Topics: attention, cuda, cute, cutlass, flash-attention, flashattention

Trending in Cuda

1. lavawolfiee/mini-flash-attention

Minimal FlashAttention in CUDA C++/CuTe: readable WMMA/CuTe kernels, no NxN workspace, up to 4.5x faster than naive PyTorch

GitHub repository with 21 stars and 1 fork.

Trending score: 1.02; stars gained: +9; forks gained: +1.

Language: Cuda

Topics: attention, cuda, cute, cutlass, flash-attention, flashattention

2. XopMC/brainflayer-CUDA

"brainflayer" CUDA & private key recovery tool

GitHub repository with 6 stars and 4 forks.

Trending score: 0.05; stars gained: +0; forks gained: +0.

Language: Cuda

3. gau-nernst/learn-cuda

Learn CUDA with PyTorch

GitHub repository with 313 stars and 49 forks.

Trending score: 0.04; stars gained: +0; forks gained: +0.

Language: Cuda

4. joesharratt1229/ThriftAttention

GitHub repository with 14 stars and 1 fork.

Trending score: 0.04; stars gained: +0; forks gained: +0.

Language: Cuda

5. AmrMSharafeldin/semanticfoam

This repository contains the official implementation of Semantic Foam: Unifying Spatial and Semantic Scene Decomposition

GitHub repository with 10 stars and 0 forks.

Trending score: 0.04; stars gained: +0; forks gained: +0.

Language: Cuda

Trending topic: cuda

1. vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

GitHub repository with 82,008 stars and 17,694 forks.

Trending score: 3.75; stars gained: +79; forks gained: +46.

Language: Python

Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt

2. tenstorrent/tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming model.

GitHub repository with 1,495 stars and 481 forks.

Trending score: 1.82; stars gained: +7; forks gained: +5.

Language: C++

Topics: accelerator, ai, cuda, deepseek, gpu, img-gen

3. AmmarkoV/SAM3DBody-cpp

Real-time 3D full-body reconstruction from a single camera, Multiperson BVH output, Pure C++ runtime, ONNX + ggml, 70-joint skeleton with hands.

GitHub repository with 475 stars and 62 forks.

Trending score: 1.78; stars gained: +2; forks gained: +1.

Language: C

Topics: 3d-human-pose, bvh, computer-vision, cpp, cuda, ggml

4. sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

GitHub repository with 28,865 stars and 6,350 forks.

Trending score: 1.72; stars gained: -55; forks gained: +18.

Language: Python

Topics: attention, blackwell, cuda, deepseek, diffusion, glm

5. c0deJedi/nbd-vram

Use your NVIDIA GPU's VRAM as swap space on Linux. Built for laptops with soldered memory and no upgrade path. If you have an RTX card sitting there with 8GB of VRAM and you're getting swapped to SSD, this puts that VRAM to work

GitHub repository with 399 stars and 10 forks.

Trending score: 1.53; stars gained: +39; forks gained: +0.

Language: Shell

Topics: cuda, gpu, laptop, linux, memory, nbd

6. flashinfer-ai/flashinfer

FlashInfer: Kernel Library for LLM Serving

GitHub repository with 5,752 stars and 1,026 forks.

Trending score: 1.16; stars gained: +15; forks gained: +8.

Language: Python

Topics: attention, cuda, distributed-inference, gpu, jit, large-large-models