this post was submitted on 28 Feb 2024
LocalLLaMA
As far as I understand, their contribution is to take what has proven to work well in the Llama architecture and apply it to what BitNet does, and to add a '0' to the allowed weight values, so each weight is -1, 0 or +1 instead of just -1 or +1. Maybe you just don't need that much text to explain it, just the statistics.
They claim it scales the way an FP16 Llama model does. So unless their judgement or maths is wrong, it should hold up. I can't comment on that, but I'd like it to be true.
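To make the "add a '0'" part concrete, here is a minimal sketch of ternary weight quantization. It assumes an absmean-style scheme: divide by the mean absolute weight, round, then clip to [-1, 1]. The function name `ternary_quantize`, the `eps` parameter and the sample weights are made up for illustration and this is not the authors' code. With three possible values, each weight carries about log2(3) ≈ 1.58 bits.

```python
def ternary_quantize(weights, eps=1e-8):
    """Map a flat list of float weights to values in {-1, 0, 1} plus one scale.

    Assumption: absmean-style scaling, used here only as an illustration.
    """
    # Scale factor: mean absolute value of all weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Normalize by the scale, round to the nearest integer, clip to [-1, 1].
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


# Hypothetical example weights, not taken from any real model.
weights = [0.9, -0.05, 0.4, -1.2, 0.02]
q, scale = ternary_quantize(weights)
print(q)      # small weights collapse to 0, large ones to -1 or +1
print(scale)  # multiply by this to get approximate original magnitudes
```

The small weights (-0.05 and 0.02) end up as 0, which is the extra state a strictly binary -1/+1 scheme doesn't have.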