rapidsai/cuml

cuML - RAPIDS Machine Learning Library

GitHub repository with 5,207 stars and 629 forks.

Language: Python

Topics: cuda, gpu, machine-learning, machine-learning-algorithms, nvidia

Open provider repository

Latest metric snapshot

2026-06-05: 5,207 stars and 629 forks.

Similar repositories

  1. 1. vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    GitHub repository with 81,995 stars and 17,676 forks.

    Trending score: 3.75; stars gained: +79; forks gained: +46.

    Language: Python

    Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt

  2. 2. gpustack/gpustack

    A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.

    GitHub repository with 5,107 stars and 542 forks.

    Trending score: 2.51; stars gained: +11; forks gained: +1.

    Language: Python

    Topics: ascend, cuda, deepseek, distributed-inference, genai, high-performance-inference

  3. 3. LMCache/LMCache

    LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

    GitHub repository with 8,422 stars and 1,246 forks.

    Trending score: 2.17; stars gained: +11; forks gained: +6.

    Language: Python

    Topics: amd, cuda, fast, inference, kv-cache, llm

  4. 4. sgl-project/sglang

    SGLang is a high-performance serving framework for large language models and multimodal models.

    GitHub repository with 28,859 stars and 6,348 forks.

    Trending score: 1.72; stars gained: -55; forks gained: +18.

    Language: Python

    Topics: attention, blackwell, cuda, deepseek, diffusion, glm

  5. 5. NVIDIA/TensorRT-LLM

    TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

    GitHub repository with 13,807 stars and 2,440 forks.

    Trending score: 1.18; stars gained: +16; forks gained: +7.

    Language: Python

    Topics: blackwell, cuda, llm-serving, moe, pytorch

  6. 6. NVIDIA/TransformerEngine

    A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.

    GitHub repository with 3,380 stars and 739 forks.

    Trending score: 1.10; stars gained: +1; forks gained: +3.

    Language: Python

    Topics: cuda, deep-learning, gpu, machine-learning, python, pytorch

Trending in Python

  1. 1. NousResearch/hermes-agent

    The agent that grows with you

    GitHub repository with 181,898 stars and 31,209 forks.

    Trending score: 5.95; stars gained: +1,867; forks gained: +361.

    Language: Python

    Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

  2. 2. chopratejas/headroom

    Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

    GitHub repository with 13,768 stars and 870 forks.

    Trending score: 5.69; stars gained: +2,829; forks gained: +175.

    Language: Python

    Topics: agent, ai, anthropic, compression, context-engineering, context-window

  3. 3. Imbad0202/academic-research-skills

    Academic Research Skills for Claude Code: research → write → review → revise → finalize

    GitHub repository with 27,484 stars and 2,256 forks.

    Trending score: 5.52; stars gained: +1,079; forks gained: +89.

    Language: Python

    Topics: academic-pipeline, academic-writing, ai-research, claude, claude-code, literature-review

  4. 4. rohitg00/ai-engineering-from-scratch

    Learn it. Build it. Ship it for others.

    GitHub repository with 28,622 stars and 4,680 forks.

    Trending score: 5.32; stars gained: +1,261; forks gained: +238.

    Language: Python

    Topics: agents, ai, ai-agents, ai-engineering, computer-vision, course

  5. 5. anthropics/financial-services

    GitHub repository with 30,029 stars and 4,231 forks.

    Trending score: 4.88; stars gained: +688; forks gained: +114.

    Language: Python

  6. 6. vinta/awesome-python

    An opinionated list of Python frameworks, libraries, tools, and resources

    GitHub repository with 301,396 stars and 28,042 forks.

    Trending score: 4.60; stars gained: +518; forks gained: +24.

    Language: Python

    Topics: awesome, python, collections, python-frameworks, python-libraries, python-tools

Trending topic: cuda

  1. 1. vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    GitHub repository with 81,995 stars and 17,676 forks.

    Trending score: 3.75; stars gained: +79; forks gained: +46.

    Language: Python

    Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt

  2. 2. gpustack/gpustack

    A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.

    GitHub repository with 5,107 stars and 542 forks.

    Trending score: 2.51; stars gained: +11; forks gained: +1.

    Language: Python

    Topics: ascend, cuda, deepseek, distributed-inference, genai, high-performance-inference

  3. 3. Luce-Org/lucebox-hub

    Fast LLM speculative inference server for consumer hardware.

    GitHub repository with 2,332 stars and 217 forks.

    Trending score: 2.31; stars gained: +17; forks gained: +3.

    Language: C++

    Topics: kernel, llama-cpp, local-ai, nvidia-cuda, qwen, rtx3090

  4. 4. LMCache/LMCache

    LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

    GitHub repository with 8,422 stars and 1,246 forks.

    Trending score: 2.17; stars gained: +11; forks gained: +6.

    Language: Python

    Topics: amd, cuda, fast, inference, kv-cache, llm

  5. 5. shader-slang/slang

    Making it easier to work with shaders

    GitHub repository with 5,349 stars and 451 forks.

    Trending score: 2.08; stars gained: +4; forks gained: +2.

    Language: C++

    Topics: shaders, hlsl, glsl, d3d12, vulkan, cuda

  6. 6. tenstorrent/tt-metal

    :metal: TT-NN operator library, and TT-Metalium low level kernel programming model.

    GitHub repository with 1,494 stars and 480 forks.

    Trending score: 1.82; stars gained: +7; forks gained: +5.

    Language: C++

    Topics: accelerator, ai, cuda, deepseek, gpu, img-gen