edaywalid/llm-inference-gateway
OpenAI-compatible LLM gateway: routes each request to the cheapest model that meets its latency and quality targets, with SSE streaming, semantic caching, per-key rate limiting, and cost tracking.
GitHub repository with 7 stars and 0 forks.
Language: Python
Topics: api-gateway, fastapi, gemini, groq, llm, llmops, openai, postgres, rate-limiting, redis
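
The description says the gateway picks the cheapest model that meets latency and quality targets. Here is a minimal sketch of one way such a selection rule could work. It is not taken from the repository: `ModelProfile`, `pick_cheapest`, the model names, and all numbers are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-model stats; a real gateway would measure or configure these."""
    name: str
    cost_per_1k_tokens: float  # illustrative cost units, not real pricing
    p95_latency_ms: float      # assumed observed 95th-percentile latency
    quality_score: float       # assumed 0.0 to 1.0 scale, higher is better


def pick_cheapest(
    models: Sequence[ModelProfile],
    max_latency_ms: float,
    min_quality: float,
) -> Optional[ModelProfile]:
    """Return the lowest-cost model satisfying both targets, or None if none qualifies."""
    eligible = [
        m for m in models
        if m.p95_latency_ms <= max_latency_ms and m.quality_score >= min_quality
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


# Made-up catalog for illustration only.
CATALOG = [
    ModelProfile("small-fast", 0.10, 300.0, 0.60),
    ModelProfile("mid-balanced", 0.50, 800.0, 0.80),
    ModelProfile("large-accurate", 3.00, 2000.0, 0.95),
]

choice = pick_cheapest(CATALOG, max_latency_ms=1000.0, min_quality=0.75)
print(choice.name if choice else "no model meets the targets")
```

Filtering first and then minimizing cost keeps the targets as hard constraints. When nothing qualifies, the function returns `None`, so the caller decides whether to relax a target or reject the request.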