thushan/olla
High-performance, lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover, and unified model discovery across local and remote inference backends.
GitHub repository with 236 stars and 31 forks.
Language: Go
Topics: ai, llm-inference, lmstudio, ollama, proxy, vllm, golang, llamacpp, llm-proxy, llm-router