Quoting nevent1q…9grn: "In 3 years, we will see LLM ASICs on a USB Stick."
This paper proposes eliminating costly matrix multiplication in LLMs, claiming a 10x reduction in memory use during compute. If they can shrink a 70B model to the footprint of a 7B model, we'll be running these things on phones.
https://arxiv.org/abs/2406.02528
Daniel on Nostr: Who will be the Apple of personal AI? That's what I'm waiting for.