Mizistein/omlx
🤖 Optimize LLM inference on Mac with continuous batching and SSD caching, managed from your menu bar.
GitHub repository with 8 stars and 0 forks.
Language: Python
Topics: apple-silicon, chatbot, inference-server, llm, macos, mlx, model-serving, openai-api