zengxiao-he/tessera

From teacher to tiles — a from-scratch LLM distillation & serving engine: custom Triton/CUDA kernels, FSDP distillation, paged-KV continuous batching, speculative decoding, a Rust gateway, a JAX oracle, and interpretability tooling.

GitHub repository with 181 stars and 1 fork.

Language: Python

Topics: cuda, flash-attention, fsdp, inference-engine, jax, knowledge-distillation, kv-cache, llm, mechanistic-interpretability, ml-systems

24h trend summary

Trending score 2.19, freshness score 0.00, stars gained +11, forks gained +0.

Latest metric snapshot

2026-06-15: 181 stars and 1 fork.

Similar repositories

1. LMCache/LMCache

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

GitHub repository with 9,093 stars and 1,322 forks.

Trending score: 4.67; stars gained: +411; forks gained: +26.

Language: Python

Topics: amd, cuda, fast, inference, kv-cache, llm

2. vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

GitHub repository with 82,932 stars and 18,083 forks.

Trending score: 4.18; stars gained: +80; forks gained: +23.

Language: Python

Topics: amd, blackwell, cuda, deepseek, deepseek-v3, gpt

3. sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

GitHub repository with 29,041 stars and 6,540 forks.

Trending score: 3.37; stars gained: +33; forks gained: +15.

Language: Python

Topics: attention, blackwell, cuda, deepseek, diffusion, glm

4. NVIDIA/TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

GitHub repository with 13,881 stars and 2,465 forks.

Trending score: 2.24; stars gained: +7; forks gained: +2.

Language: Python

Topics: blackwell, cuda, llm-serving, moe, pytorch

5. roflcoopter/viseron

Self-hosted, local only NVR and AI Computer Vision software. With features such as object detection, motion detection, face recognition and more, it gives you the power to keep an eye on your home, office or any other place you want to monitor.

GitHub repository with 3,231 stars and 396 forks.

Trending score: 2.21; stars gained: +22; forks gained: +0.

Language: Python

Topics: nvr, network-video-capture, network-video-recorder, tensorflow, darknet, yolo

6. zengxiao-he/tessera

From teacher to tiles — a from-scratch LLM distillation & serving engine: custom Triton/CUDA kernels, FSDP distillation, paged-KV continuous batching, speculative decoding, a Rust gateway, a JAX oracle, and interpretability tooling.

GitHub repository with 181 stars and 1 fork.

Trending score: 2.19; stars gained: +11; forks gained: +0.

Language: Python

Topics: cuda, flash-attention, fsdp, inference-engine, jax, knowledge-distillation

Trending in Python

1. harry0703/MoneyPrinterTurbo

Generate high-definition short videos with one click using large AI models.

GitHub repository with 88,031 stars and 12,625 forks.

Trending score: 6.02; stars gained: +1,097; forks gained: +218.

Language: Python

Topics: ai, automation, chatgpt, moviepy, python, shortvideo

2. pewdiepie-archdaemon/odysseus

Self-hosted AI workspace.

GitHub repository with 71,467 stars and 9,114 forks.

Trending score: 5.98; stars gained: +834; forks gained: +140.

Language: Python

3. NousResearch/hermes-agent

The agent that grows with you

GitHub repository with 194,134 stars and 33,994 forks.

Trending score: 5.92; stars gained: +753; forks gained: +209.

Language: Python

Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

4. NVIDIA/SkillSpector

Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks.

GitHub repository with 5,962 stars and 441 forks.

Trending score: 5.61; stars gained: +874; forks gained: +76.

Language: Python

5. rohitg00/ai-engineering-from-scratch

Learn it. Build it. Ship it for others.

GitHub repository with 32,676 stars and 5,366 forks.

Trending score: 5.59; stars gained: +762; forks gained: +135.

Language: Python

Topics: agents, ai, ai-agents, ai-engineering, computer-vision, course

6. Agents365-ai/drawio-skill

Generate draw.io diagrams from natural language — 6 presets, vision self-check + up to 5-round refinement, codebase-to-diagram, 10,000+ official shapes & 321 AI/LLM brand logos. Exports PNG/SVG/PDF/JPG.

GitHub repository with 3,445 stars and 240 forks.

Trending score: 5.51; stars gained: +1,369; forks gained: +113.

Language: Python

Topics: agent-skill, agent-skills, architecture-diagram, claude-code, claude-code-skill, claude-skills

Trending topic: cuda

1. LMCache/LMCache

2. vllm-project/vllm

3. sgl-project/sglang

4. Luce-Org/lucebox-hub

5. NVlabs/cuda-oxide

6. NVIDIA/TensorRT-LLM