Xilinx/brevitas

Brevitas: neural network quantization in PyTorch

GitHub repository with 1,540 stars and 245 forks.

Language: Python

Topics: quantization, pytorch, brevitas, fpga, neural-networks, hardware-acceleration, xilinx, deep-learning, ptq, qat

Latest metric snapshot

2026-06-13: 1,540 stars and 245 forks.

Similar repositories

  1. hiyouga/LlamaFactory

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    GitHub repository with 72,166 stars and 8,829 forks.

    Trending score: 3.37; stars gained: +39; forks gained: +2.

    Language: Python

    Topics: agent, ai, deepseek, fine-tuning, gemma, gpt

  2. zengxiao-he/tessera

    From teacher to tiles — a from-scratch LLM distillation & serving engine: custom Triton/CUDA kernels, FSDP distillation, paged-KV continuous batching, speculative decoding, a Rust gateway, a JAX oracle, and interpretability tooling.

    GitHub repository with 175 stars and 1 fork.

    Trending score: 2.19; stars gained: +11; forks gained: +0.

    Language: Python

    Topics: cuda, flash-attention, fsdp, inference-engine, jax, knowledge-distillation

  3. Tencent/AngelSlim

    Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

    GitHub repository with 1,308 stars and 150 forks.

    Trending score: 1.73; stars gained: +4; forks gained: +0.

    Language: Python

    Topics: audio, deepseek, dflash, diffusion, eagle, fp4

  4. huawei-csl/KVarN

    KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.

    GitHub repository with 397 stars and 22 forks.

    Trending score: 1.60; stars gained: +2; forks gained: +0.

    Language: Python

    Topics: agentic-ai, kv-cache, llm, llm-inference, long-context, quantization

  5. intel/auto-round

    A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.

    GitHub repository with 1,453 stars and 140 forks.

    Trending score: 1.55; stars gained: +3; forks gained: +0.

    Language: Python

    Topics: int4, quantization, rounding, transformers, vllm, mxfp4

  6. wanshuiyin/ARIS-in-AI-Offer

    Bilingual (Chinese + English) ML / LLM / diffusion / agent interview cheat sheets for AI autumn recruiting (秋招): generated by ARIS /interview-cheatsheet, rendered by /render-html into single-file HTML that reads anywhere, plus a CV→DBLP-fact-checked academic homepage generator and hand-authored long-form blogs 🌱

    GitHub repository with 207 stars and 8 forks.

    Trending score: 1.44; stars gained: +2; forks gained: +0.

    Language: Python

    Topics: ai-interview, aris, autumn-recruiting, cheatsheet, chinese, claude-code

Trending in Python

  1. chopratejas/headroom

    Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

    GitHub repository with 27,902 stars and 1,891 forks.

    Trending score: 6.49; stars gained: +2,776; forks gained: +250.

    Language: Python

    Topics: agent, ai, anthropic, claude-code, compression, context-engineering

  2. harry0703/MoneyPrinterTurbo

    Generate high-definition short videos with one click using AI large language models.

    GitHub repository with 87,926 stars and 12,612 forks.

    Trending score: 6.02; stars gained: +1,097; forks gained: +218.

    Language: Python

    Topics: ai, automation, chatgpt, moviepy, python, shortvideo

  3. pewdiepie-archdaemon/odysseus

    Self-hosted AI workspace.

    GitHub repository with 71,235 stars and 9,075 forks.

    Trending score: 5.98; stars gained: +834; forks gained: +140.

    Language: Python

  4. NousResearch/hermes-agent

    The agent that grows with you

    GitHub repository with 193,818 stars and 33,911 forks.

    Trending score: 5.92; stars gained: +753; forks gained: +209.

    Language: Python

    Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

  5. NVIDIA/SkillSpector

    Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks.

    GitHub repository with 5,654 stars and 427 forks.

    Trending score: 5.61; stars gained: +874; forks gained: +76.

    Language: Python

  6. rohitg00/ai-engineering-from-scratch

    Learn it. Build it. Ship it for others.

    GitHub repository with 32,527 stars and 5,342 forks.

    Trending score: 5.59; stars gained: +762; forks gained: +135.

    Language: Python

    Topics: agents, ai, ai-agents, ai-engineering, computer-vision, course

Trending topic: quantization

  1. hiyouga/LlamaFactory

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    GitHub repository with 72,166 stars and 8,829 forks.

    Trending score: 3.37; stars gained: +39; forks gained: +2.

    Language: Python

    Topics: agent, ai, deepseek, fine-tuning, gemma, gpt

  2. amitshekhariitbhu/ai-engineering-interview-questions

    Your cheat sheet for AI engineering interviews: questions and answers.

    GitHub repository with 1,861 stars and 339 forks.

    Trending score: 3.06; stars gained: +32; forks gained: +5.

    Language: Markdown

    Topics: agents, ai, ai-agents, ai-engineering, interview, interview-preparation

  3. zengxiao-he/tessera

    From teacher to tiles — a from-scratch LLM distillation & serving engine: custom Triton/CUDA kernels, FSDP distillation, paged-KV continuous batching, speculative decoding, a Rust gateway, a JAX oracle, and interpretability tooling.

    GitHub repository with 175 stars and 1 fork.

    Trending score: 2.19; stars gained: +11; forks gained: +0.

    Language: Python

    Topics: cuda, flash-attention, fsdp, inference-engine, jax, knowledge-distillation

  4. timtoole02/Camelid

    Camelid: a Rust-native local inference backend with evidence-gated model compatibility.

    GitHub repository with 79 stars and 10 forks.

    Trending score: 1.95; stars gained: +14; forks gained: +0.

    Language: Rust

    Topics: apple-silicon, gguf, inference, llama, llm, local-first

  5. Tencent/AngelSlim

    Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

    GitHub repository with 1,308 stars and 150 forks.

    Trending score: 1.73; stars gained: +4; forks gained: +0.

    Language: Python

    Topics: audio, deepseek, dflash, diffusion, eagle, fp4

  6. huawei-csl/KVarN

    KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.

    GitHub repository with 397 stars and 22 forks.

    Trending score: 1.60; stars gained: +2; forks gained: +0.

    Language: Python

    Topics: agentic-ai, kv-cache, llm, llm-inference, long-context, quantization