NVIDIA/TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.

GitHub repository with 3,379 stars and 739 forks.

Language: Python

Topics: cuda, deep-learning, gpu, machine-learning, python, pytorch, fp8, jax, fp4

24h trend summary

Trending score 1.10, activity score 0.05, stars gained +1, forks gained +3.

Latest metric snapshot

2026-06-05: 3,379 stars and 739 forks.

Similar repositories

1. vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

GitHub repository with 81,990 stars and 17,671 forks.

Trending score: 3.75; stars gained: +79; forks gained: +46.

Language: Python

Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt

2. gpustack/gpustack

A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.

GitHub repository with 5,106 stars and 542 forks.

Trending score: 2.51; stars gained: +11; forks gained: +1.

Language: Python

Topics: ascend, cuda, deepseek, distributed-inference, genai, high-performance-inference

3. LMCache/LMCache

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

GitHub repository with 8,422 stars and 1,246 forks.

Trending score: 2.17; stars gained: +11; forks gained: +6.

Language: Python

Topics: amd, cuda, fast, inference, kv-cache, llm

4. sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

GitHub repository with 28,882 stars and 6,345 forks.

Trending score: 1.72; stars gained: -55; forks gained: +18.

Language: Python

Topics: attention, blackwell, cuda, deepseek, diffusion, glm

5. NVIDIA/TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

GitHub repository with 13,807 stars and 2,440 forks.

Trending score: 1.18; stars gained: +16; forks gained: +7.

Language: Python

Topics: blackwell, cuda, llm-serving, moe, pytorch

6. flashinfer-ai/flashinfer

FlashInfer: Kernel Library for LLM Serving

GitHub repository with 5,748 stars and 1,026 forks.

Trending score: 1.16; stars gained: +15; forks gained: +8.

Language: Python

Topics: gpu, large-large-models, cuda, pytorch, llm-inference, jit

Trending in Python

1. NousResearch/hermes-agent

The agent that grows with you

GitHub repository with 181,649 stars and 31,166 forks.

Trending score: 5.95; stars gained: +1,867; forks gained: +361.

Language: Python

Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

2. chopratejas/headroom

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

GitHub repository with 13,361 stars and 853 forks.

Trending score: 5.69; stars gained: +2,829; forks gained: +175.

Language: Python

Topics: agent, ai, anthropic, compression, context-engineering, context-window

3. Imbad0202/academic-research-skills

Academic Research Skills for Claude Code: research → write → review → revise → finalize

GitHub repository with 27,422 stars and 2,253 forks.

Trending score: 5.52; stars gained: +1,079; forks gained: +89.

Language: Python

Topics: academic-pipeline, academic-writing, ai-research, claude, claude-code, literature-review

4. anthropics/financial-services

GitHub repository with 30,029 stars and 4,231 forks.

Trending score: 4.88; stars gained: +688; forks gained: +114.

Language: Python

5. virgiliojr94/book-to-skill

Turn any technical book PDF into a Claude Code skill — ready to study, reference, and use while you work.

GitHub repository with 4,250 stars and 534 forks.

Trending score: 4.88; stars gained: +476; forks gained: +68.

Language: Python

6. vinta/awesome-python

An opinionated list of Python frameworks, libraries, tools, and resources

GitHub repository with 301,371 stars and 28,044 forks.

Trending score: 4.60; stars gained: +518; forks gained: +24.

Language: Python

Topics: awesome, python, collections, python-frameworks, python-libraries, python-tools

Trending topic: cuda

1. vllm-project/vllm

2. gpustack/gpustack

3. Luce-Org/lucebox-hub

4. LMCache/LMCache

5. tenstorrent/tt-metal

6. AmmarkoV/SAM3DBody-cpp