this post was submitted on 26 Jul 2023
19 points (100.0% liked)

LocalLLaMA

2218 readers

A community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 1 year ago

For example, does a 13B parameter model at 2_K quantization perform worse than a 7B parameter model at 8-bit or 16-bit?

top 7 comments
[–] [email protected] 9 points 1 year ago* (last edited 1 year ago) (2 children)

https://github.com/ggerganov/llama.cpp#quantization

https://github.com/ggerganov/llama.cpp/pull/1684

Regarding your question: 13B at 2_K seems to be on par with 7B at 16-bit and 8-bit; there isn't much of a difference between them. (Look at the perplexity values; lower is better.) The second link has a nice graph.
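
In case it helps to see what those perplexity numbers actually measure, here's a minimal Python sketch of the textbook definition (not code from llama.cpp): perplexity is the exponential of the average negative log-likelihood per token, so a model that assigns higher probability to the evaluation text gets a lower score. The probability values below are made up purely for illustration.

```python
import math

def perplexity(token_logprobs):
    """exp of the average negative log-likelihood per token.
    Lower means the model is, on average, less 'surprised' by the text."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy example: natural-log probabilities a model assigned to each next token.
# A better (e.g. less aggressively quantized) model assigns higher probabilities,
# so its perplexity comes out lower.
good_model = [math.log(p) for p in (0.30, 0.25, 0.40, 0.35)]
worse_model = [math.log(p) for p in (0.15, 0.10, 0.20, 0.18)]

print(perplexity(good_model))   # ~3.1
print(perplexity(worse_model))  # ~6.6
```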

Most people don't go as low as 2-bit, though. Look at the graph: below 4-bit, things start to deteriorate.
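
For intuition on why quality falls off so quickly below 4-bit: GGML-style formats store one scale per small block of weights and round each weight to a few-bit integer, and the rounding step gets much coarser as the bit width shrinks. This is a deliberately simplified symmetric sketch, not the actual 2_K / K-quant layout (which uses super-blocks, per-block minimums, and mixed precision), but it shows the trend.

```python
import numpy as np

def quantize_block(weights, bits):
    """Simplified block quantization: one scale per block, symmetric rounding.
    (Illustrative only; real llama.cpp quant formats are more elaborate.)"""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8-bit, 7 for 4-bit, 1 for 2-bit
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale                           # dequantized values

rng = np.random.default_rng(0)
block = rng.normal(size=32).astype(np.float32)  # one block of 32 weights

for bits in (8, 4, 2):
    err = np.abs(block - quantize_block(block, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")

# The error grows modestly from 8-bit to 4-bit and jumps sharply at 2-bit,
# which matches the "below 4-bit things deteriorate" pattern in the graph.
```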

[–] [email protected] 5 points 1 year ago

That graph is great. Very easy to understand. Thank you!

[–] [email protected] 2 points 1 year ago

These are good sources. To add one more, the GPTQ paper discusses perplexity at length across several quantization levels and model sizes:

https://arxiv.org/abs/2210.17323

[–] [email protected] 2 points 1 year ago (1 children)

Anyone else see 11 in the post's comment count but only 2 actual comments?

[–] [email protected] 2 points 1 year ago (1 children)
[–] [email protected] 5 points 1 year ago (1 children)

Well, a few of those extra numbers are my fault. I edited my answer a few times, and Lemmy reportedly counts every edit as an additional comment (when the user and community are on different instances). I hope they fix that soon.

[–] [email protected] 2 points 1 year ago

Ahh, makes sense. I just made a post and deleted the comment I made on it, but it glitched and deleted twice, so now my post has -1 comments lmao.
