Sakura66/sagesched
SageSched: Intelligent LLM Request Scheduler with Workload Prediction — QoS-aware dual-queue scheduling for black-box LLM APIs (OpenAI/Azure/Doubao/Gemini)
GitHub repository with 7 stars and 0 forks.
Language: Python
Topics: api-gateway, faiss, fastapi, gittins-index, llm, llm-inference, llm-proxy, load-balancer, openai, qos