Alvov1/Aesi-Multiprecision
Static-sized long-precision arithmetic library for use inside GPU parallelization with CUDA
GitHub repository with 14 stars and 1 fork.
Language: C++
Topics: cuda, gpu-acceleration, long-arithmetics, multiprecision
2026-06-05: 14 stars and 1 fork.
:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
GitHub repository with 1,495 stars and 481 forks.
Trending score: 1.82; stars gained: +7; forks gained: +5.
Language: C++
Topics: accelerator, ai, cuda, deepseek, gpu, img-gen
CUDA Templates and Python DSLs for High-Performance Linear Algebra
GitHub repository with 9,844 stars and 1,892 forks.
Trending score: 1.04; stars gained: +11; forks gained: +3.
Language: C++
Topics: cuda, deep-learning, deep-learning-library, cpp, nvidia, gpu
ALIEN is a CUDA-powered artificial life simulation program.
GitHub repository with 5,423 stars and 185 forks.
Trending score: 0.98; stars gained: +1; forks gained: +0.
Language: C++
Topics: artificial-life, open-ended-evolution, agent-based-simulation, physics-engine, cuda
CUDA Core Compute Libraries
GitHub repository with 2,366 stars and 401 forks.
Trending score: 0.57; stars gained: +2; forks gained: +1.
Language: C++
Topics: accelerated-computing, cpp, cpp-programming, cuda, cuda-cpp, cuda-kernels
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
GitHub repository with 8,972 stars and 1,297 forks.
Trending score: 0.49; stars gained: +2; forks gained: +0.
Language: C++
Topics: machine-learning, decision-trees, gradient-boosting, gbm, gbdt, python
Ultra high-performance secp256k1 ECC engine | Python, Node.js, Rust, Go, C#, Swift, Java bindings | CUDA, Metal, OpenCL GPU | ECDSA, Schnorr, FROST, MuSig2, BIP-352 | 15+ platforms
GitHub repository with 40 stars and 18 forks.
Trending score: 0.33; stars gained: +1; forks gained: +2.
Language: C++
Topics: android, arm64, bitcoin, constant-time, cryptography, cuda
LLM inference in C/C++
GitHub repository with 114,816 stars and 19,210 forks.
Trending score: 4.40; stars gained: +304; forks gained: +99.
Language: C++
Topics: ggml
DuckDB is an analytical in-process SQL database management system
GitHub repository with 38,625 stars and 3,301 forks.
Trending score: 3.50; stars gained: +40; forks gained: +6.
Language: C++
Topics: analytics, database, embedded-database, olap, sql
Community maintained hardware plugin for vLLM on Ascend
GitHub repository with 2,201 stars and 1,350 forks.
Trending score: 3.25; stars gained: +16; forks gained: +22.
Language: C++
Topics: ascend, inference, llm, llm-serving, llmops, mlops
ClickHouse® is a real-time analytics database management system
GitHub repository with 47,843 stars and 8,471 forks.
Trending score: 2.96; stars gained: +53; forks gained: +10.
Language: C++
Topics: ai, analytics, big-data, clickhouse, cloud-native, cpp
Truly independent web browser
GitHub repository with 63,807 stars and 3,076 forks.
Trending score: 2.89; stars gained: +52; forks gained: +5.
Language: C++
Topics: browser, browser-engine
Distribute and run LLMs with a single file.
GitHub repository with 24,664 stars and 1,370 forks.
Trending score: 2.66; stars gained: +38; forks gained: +4.
Language: C++
Topics: cross-platform, gguf, llama-cpp, local-ai, local-inference, local-llm
A high-throughput and memory-efficient inference and serving engine for LLMs
GitHub repository with 82,006 stars and 17,693 forks.
Trending score: 3.75; stars gained: +79; forks gained: +46.
Language: Python
Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt
:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
GitHub repository with 1,495 stars and 481 forks.
Trending score: 1.82; stars gained: +7; forks gained: +5.
Language: C++
Topics: accelerator, ai, cuda, deepseek, gpu, img-gen
Real-time 3D full-body reconstruction from a single camera, Multiperson BVH output, Pure C++ runtime, ONNX + ggml, 70-joint skeleton with hands.
GitHub repository with 475 stars and 62 forks.
Trending score: 1.78; stars gained: +2; forks gained: +1.
Language: C
Topics: 3d-human-pose, bvh, computer-vision, cpp, cuda, ggml
SGLang is a high-performance serving framework for large language models and multimodal models.
GitHub repository with 28,865 stars and 6,350 forks.
Trending score: 1.72; stars gained: -55; forks gained: +18.
Language: Python
Topics: attention, blackwell, cuda, deepseek, diffusion, glm
Use your NVIDIA GPU's VRAM as swap space on Linux. Built for laptops with soldered memory and no upgrade path. If you have an RTX card sitting there with 8GB of VRAM and you're getting swapped to SSD, this puts that VRAM to work
GitHub repository with 399 stars and 10 forks.
Trending score: 1.53; stars gained: +39; forks gained: +0.
Language: Shell
Topics: cuda, gpu, laptop, linux, memory, nbd
FlashInfer: Kernel Library for LLM Serving
GitHub repository with 5,752 stars and 1,026 forks.
Trending score: 1.16; stars gained: +15; forks gained: +8.
Language: Python
Topics: attention, cuda, distributed-inference, gpu, jit, large-large-models