this post was submitted on 10 Jun 2023
13 points (100.0% liked)

LocalLLaMA


Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.


I’ve been using llama.cpp, gpt-llama and chatbot-ui for a while now, and I’m very happy with it. However, I’m now looking into a more stable setup that runs entirely on the GPU. Is llama.cpp still a good candidate for that?
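
For reference, this is roughly what GPU offload looks like through llama.cpp’s Python bindings (llama-cpp-python). The model path and layer count below are placeholders, and you need a build compiled with GPU support (e.g. cuBLAS) for n_gpu_layers to have any effect:

```python
# Minimal sketch: GPU offload via llama-cpp-python.
# Placeholder model path; pick n_gpu_layers to fit your VRAM.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-7b.q4_0.bin",  # placeholder path
    n_gpu_layers=35,  # layers offloaded to the GPU; raise until VRAM is full
    n_ctx=2048,       # context window
)

out = llm("Q: What is llama.cpp? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```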

[–] [email protected] 3 points 1 year ago (4 children)

GPTQ-for-LLaMA with oobabooga’s text-generation-webui works pretty well. I’m not sure to what extent it uses the CPU, but my GPU sits at 100% during inference, so it seems to do most of the work there.
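
If it helps, here is a rough sketch of GPU-only 4-bit inference using the AutoGPTQ library (a close cousin of the GPTQ-for-LLaMA repo, not necessarily the exact code path the webui takes). The model directory is a placeholder, and whether the checkpoint ships as safetensors is an assumption:

```python
# Minimal sketch: loading a 4-bit GPTQ checkpoint entirely on the GPU
# with AutoGPTQ. Paths are placeholders.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "./models/llama-7b-4bit-128g"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",       # keep the whole model on the GPU
    use_safetensors=True,  # assumption: checkpoint is in safetensors format
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```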

[–] [email protected] 1 points 1 year ago (3 children)

I've looked at that before. Do you use it with any UI?

[–] [email protected] 1 points 1 year ago

Personally, I’ve had nothing but issues with oobabooga’s UI, so I connect SillyTavern to it, or to koboldcpp, instead. Works great.
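
For the curious, this is roughly the kind of request SillyTavern sends to a koboldcpp backend. It assumes koboldcpp is running locally on its default port 5001 with the KoboldAI-compatible /api/v1/generate endpoint; the prompt and sampling values are just examples:

```python
# Minimal sketch: querying a local koboldcpp server directly,
# the same API a frontend like SillyTavern talks to.
import requests

resp = requests.post(
    "http://localhost:5001/api/v1/generate",
    json={
        "prompt": "You are a helpful assistant.\nUser: Hi!\nAssistant:",
        "max_length": 80,    # tokens to generate
        "temperature": 0.7,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```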
