this post was submitted on 10 Jan 2025

LocalLLaMA


Community to discuss about LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago
Do I need industry-grade GPUs, or can I scrape by with decent tokens/s on a consumer-level GPU?

[–] Sylovik 4 points 3 weeks ago (3 children)

For LLMs, you should look at AirLLM. I don't think there are convenient integrations with local chat tools yet, but an issue has already been opened on the Ollama tracker.
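To make the suggestion concrete: AirLLM's core trick is layer-by-layer inference, so only one transformer layer's weights need to sit in VRAM at a time. Here is a toy sketch of that idea (not AirLLM's actual API; the loader and shapes are made up for illustration):

```python
import numpy as np

# Toy sketch of layer-by-layer inference, the idea behind AirLLM:
# rather than holding all layers in memory at once, load one layer's
# weights, run it, and free them before loading the next. Peak memory
# is then roughly one layer, not the whole model.

def make_layer_loader(rng, dim):
    # Stands in for reading one layer's weights from disk.
    W = rng.standard_normal((dim, dim)) * 0.1
    return lambda: W

def layered_forward(x, layer_loaders):
    h = x
    for load_layer in layer_loaders:
        W = load_layer()    # "page in" this layer's weights
        h = np.tanh(h @ W)  # run the layer
        del W               # release before the next layer loads
    return h

rng = np.random.default_rng(0)
loaders = [make_layer_loader(rng, 8) for _ in range(4)]
out = layered_forward(rng.standard_normal((1, 8)), loaders)
print(out.shape)  # (1, 8)
```

The trade-off is speed: every token pass re-loads each layer from disk, which is why this approach gives you "it runs at all" on consumer hardware rather than high tokens/s.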

[–] [email protected] 1 points 3 weeks ago (1 children)

That looks like exactly the sort of thing I want. Is there any existing solution to get it to behave like an Ollama instance? (I have a bunch of services pointed at an Ollama instance running in Docker.)

[–] Sylovik 2 points 3 weeks ago

You may try Harbor. Its description claims to provide an OpenAI-compatible API.
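If the backend really is OpenAI-compatible, repointing your services is just a base-URL swap, since Ollama also exposes an OpenAI-compatible endpoint. A minimal sketch, assuming the default Ollama address (a Harbor-managed backend would expose its own URL, and the model name here is a placeholder):

```python
import json
import urllib.request

# Assumed base URL: Ollama's OpenAI-compatible API lives under
# http://localhost:11434/v1 by default; swap in your backend's address.
BASE_URL = "http://localhost:11434/v1"

def chat_completion_body(model: str, prompt: str) -> bytes:
    # The request payload shape an OpenAI-compatible server expects
    # at POST /v1/chat/completions.
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()

def ask(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=chat_completion_body(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

Any service that lets you configure an OpenAI base URL should work against this unchanged, which is what makes the "behaves like Ollama" request mostly a configuration problem.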
