thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
GitHub repository with 3,121 stars and 266 forks.
Language: Python
Topics: deepseek, gpu, llm, llm-serving, model-serving, pytorch