LLaMA Now Goes Faster on CPUs

https://justine.lol/matmul/

I wrote 84 new matmul kernels to improve llamafile CPU performance.

My kernels go 2x faster than MKL for matrices that fit in L2 cache. They are still a work in progress, since the speedup works best for prompts with fewer than 1,000 tokens.
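
To give a rough feel for what "fits in L2 cache" means, here is a minimal sketch that estimates the memory footprint of one matrix multiplication C = A × B and compares it with a cache size. It is not taken from llamafile. The function names, the 2 MiB cache size, and the matrix shapes are all assumptions chosen for illustration, and real kernels work on tiles rather than whole matrices, so treat this as back-of-the-envelope arithmetic only.

```python
def matmul_working_set_bytes(m, n, k, bytes_per_element=2):
    """Bytes needed to hold A (m x k), B (k x n), and C (m x n) at once.

    bytes_per_element defaults to 2, i.e. 16-bit floats (an assumption).
    """
    return (m * k + k * n + m * n) * bytes_per_element


def fits_in_cache(m, n, k, cache_bytes, bytes_per_element=2):
    """True if all three matrices fit in a cache of cache_bytes bytes."""
    return matmul_working_set_bytes(m, n, k, bytes_per_element) <= cache_bytes


# Hypothetical 2 MiB L2 cache; actual sizes vary by CPU.
L2_BYTES = 2 * 1024 * 1024

# Three 512 x 512 matrices of 2-byte elements: 1,572,864 bytes.
print(fits_in_cache(512, 512, 512, L2_BYTES))

# A 4096 x 4096 weight matrix alone is about 32 MiB, far larger than the cache.
print(fits_in_cache(4096, 1000, 4096, L2_BYTES))
```

Under these assumed numbers the first call reports a fit and the second does not, which illustrates why a speedup measured on cache-resident matrices may not carry over unchanged to larger problems.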