kubeai-project/kubeai
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
GitHub repository with 1,210 stars and 126 forks.
Language: Go
Topics: ai, autoscaler, faster-whisper, inference-operator, k8s, kubernetes, llm, ollama, ollama-operator, openai-api