Scottcjn/ram-coffers
NUMA-distributed weight banking for LLM inference on IBM POWER8. 147 tokens/s (8.8x stock). Part of the Proof of Physical AI stack.
GitHub repository with 147 stars and 32 forks.
Language: C
Topics: llama-cpp, llm, numa, power8, ai-inference, depin, hebbian, powerpc, proof-of-physical-ai, vec-perm