this post was submitted on 25 Jul 2024
641 points (99.1% liked)
196
16514 readers
3502 users here now
Be sure to follow the rule before you head out.
Rule: You must post before you leave.
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Yes, but 200 gb is probably already with 4 bit quantization, the weights in fp16 would be more like 800 gb IDK if its even possible to quantize more, if it is, you're probably better of going with a smaller model anyways