jjang-ai/vmlx
vMLX - JANGTQ Uber-Compressed MLX Models - L2 Disk Cache (survives restarts) + L1 Paged Cache (very fast TTFT) + Hybrid SSM Scheduler + Continuous Batching + more
GitHub repository with 615 stars and 68 forks.
Language: Python
Topics: anthropic-api, kvcache-compression, kvcache-optimization, kvcache-reuse, llm, lmstudio, macbook, mcp-server, mlx, mlxllm