llm-d/llm-d-inference-sim
A lightweight, configurable, real-time simulator that mimics the behavior of vLLM without requiring GPUs or running actual heavy models.
GitHub repository with 143 stars and 96 forks.
Language: Go
Topics: incubating