LocalLLaMA

A community for discussing LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago

How much GPU do I need to run a 90B model? (lemm.ee)

submitted 1 month ago by [email protected] to c/[email protected]

Do I need industry-grade GPUs, or can I scrape by and get decent TPS with a consumer-level GPU?

[–] fhein 2 points 1 month ago

You have to specify which quantization you find acceptable and how much context you require. I think the most affordable way to run large models locally is still to get multiple RTX 3090 cards (24 GB of VRAM each), and I guess you would need 3 or 4 of them, depending on quantization and context size.
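To make the dependence on quantization and context concrete, here is a rough back-of-the-envelope sketch. It is an estimate, not a measurement: the layer count, KV head count, and head dimension are placeholder assumptions (not the specs of any particular 90B model), the 10% overhead factor is a guess, and real quant formats often use slightly more bits per weight than their nominal number.

```python
import math

GIB = 1024 ** 3


def weights_bytes(n_params, bits_per_weight):
    """Memory for the model weights at a given quantization width."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Memory for the KV cache: one key and one value vector per layer,
    per KV head, per token. bytes_per_elem=2 assumes an fp16 cache."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def gpus_needed(total_bytes, vram_gib=24, overhead=1.1):
    """Number of cards needed, with a guessed 10% margin for activations
    and runtime buffers. vram_gib=24 matches an RTX 3090."""
    return math.ceil(total_bytes * overhead / (vram_gib * GIB))


if __name__ == "__main__":
    n_params = 90e9
    # Hypothetical architecture values, chosen only for illustration.
    kv = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128, context_len=8192)

    for bits in (4, 8):
        total = weights_bytes(n_params, bits) + kv
        print(f"{bits}-bit: {total / GIB:.1f} GiB total, "
              f"{gpus_needed(total)} x 24 GB cards")
```

Under these assumptions the weights dominate: the KV cache at 8k context is only a few GiB, so the quantization width matters far more than the context size until the context gets very long.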
