this post was submitted on 28 Apr 2024
29 points (89.2% liked)

AI Companions


Community to discuss AI-powered companions, whether platonic, romantic, or purely utilitarian. Examples include Replika, Character AI, and ChatGPT. Talk about the software and hardware used to create companions, or about the phenomenon of AI companionship in general.


The open-source language model Llama 3 has been released, and it can be run locally on a single GPU with only 4GB of VRAM using the AirLLM framework. Llama 3's performance is comparable to GPT-4 and Claude 3 Opus, and its success is attributed to a massive increase in training data and technical improvements in training methods. The model's architecture remains unchanged, but its training data has grown from 2T to 15T tokens, with a focus on quality filtering and deduplication. The development of Llama 3 highlights the importance of data quality and the role of open-source culture in AI development, and it raises questions about the future of open-source models versus closed-source ones.

Summarized by Llama 3 70B Instruct
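
For anyone who wants to try the 4GB-VRAM route, here's roughly what running Llama 3 through AirLLM looks like in Python. This is a minimal sketch based on AirLLM's documented usage; the model ID and API details are assumptions, so check the project's README for the current interface:

```python
from airllm import AutoModel

# model ID is an assumption; substitute whichever Llama 3 checkpoint you use
model = AutoModel.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")

# tokenize a single prompt
input_tokens = model.tokenizer(
    ["What is the capital of France?"],
    return_tensors="pt",
    return_attention_mask=False,
    truncation=True,
    max_length=128,
    padding=False,
)

# AirLLM loads the model layer by layer, which is how a large model
# fits on a single 4GB GPU, at the cost of generation speed
generation_output = model.generate(
    input_tokens["input_ids"].cuda(),
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True,
)

print(model.tokenizer.decode(generation_output.sequences[0]))
```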

top 4 comments
[–] voracitude 7 points 5 months ago

That's very cool. Any idea about tokens/sec performance, and on what hardware? For reference, my 3070 gets ~19-25 tokens/sec with llama3 8B.
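
The post doesn't give numbers, but tokens/sec is straightforward to measure yourself. A rough sketch, assuming the `model` and `input_tokens` from the AirLLM example above:

```python
import time

# assumes `model` and `input_tokens` from the AirLLM sketch above
start = time.perf_counter()
out = model.generate(
    input_tokens["input_ids"].cuda(),
    max_new_tokens=128,
    use_cache=True,
    return_dict_in_generate=True,
)
elapsed = time.perf_counter() - start

# count only the newly generated tokens, not the prompt
new_tokens = out.sequences.shape[1] - input_tokens["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/sec")
```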

[–] [email protected] 1 points 5 months ago (1 children)

Only works on Apple silicon. Am I reading that right?

[–] [email protected] 2 points 5 months ago

No, they just mention that only Apple silicon is supported if you're using macOS.
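
For context: PyTorch's Metal backend (MPS) only exists on Apple-silicon Macs, which is presumably why Intel Macs are unsupported. A generic backend-selection sketch, not AirLLM's actual code:

```python
import torch

# pick the best available backend; on macOS only Apple-silicon
# machines expose MPS, so Intel Macs fall through to CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(f"running on {device}")
```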

[–] [email protected] 1 points 5 months ago* (last edited 5 months ago)

I tried running ollama with the Mistral model; you need a good graphics card to run your own LLM, and I had to wait 20 minutes for one full response.

Granted, the laptop I was running it on was garbage, but it really put into perspective how expensive running an LLM can be.

This shit won't be free forever.
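
For anyone who wants to reproduce this kind of local test, a minimal sketch using the ollama Python client; it assumes you have the ollama server running and the mistral model already pulled:

```python
import ollama  # pip install ollama; assumes a local ollama server is running

# one full chat round trip against a locally served Mistral model;
# on weak hardware this is where the long wait happens
response = ollama.chat(
    model="mistral",
    messages=[{"role": "user", "content": "Explain VRAM in one sentence."}],
)
print(response["message"]["content"])
```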