brontoguana/krasis
Krasis is a hybrid LLM runtime focused on running larger models efficiently on consumer-grade, VRAM-limited hardware.
GitHub repository with 469 stars and 26 forks.
Language: C++
Topics: cpu-inference, gguf-model-support, gpu-inference, high-performance-inference, hybrid-inference, inference-engine, inference-optimization, large-language-models, llama-cpp-alternative, llm-inference