this post was submitted on 01 Dec 2024

Futurology

[–] pennomi 3 points 1 week ago

It depends. A lot of LLMs are memory-constrained. If you’re constantly thrashing GPU memory, it can be both slower and less efficient.
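To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes token generation is limited purely by how fast the weights can be read, with every weight read once per token, and that anything that does not fit in GPU memory has to be streamed over a slower host link each time. The function name, parameters, and hardware figures are all made up for illustration. It ignores compute time, the KV cache, batching, and any overlap of transfers, so treat it as an upper bound, not a benchmark.

```python
def tokens_per_second(model_bytes, vram_bytes, vram_bw, host_link_bw):
    """Rough upper bound on decode speed for a memory-bound model.

    Assumption: each generated token reads every weight exactly once.
    Weights that fit in GPU memory are read at vram_bw (bytes/s);
    the remainder is re-streamed over the host link at host_link_bw.
    """
    resident = min(model_bytes, vram_bytes)   # part that stays on the GPU
    spilled = model_bytes - resident          # part that thrashes in and out
    seconds_per_token = resident / vram_bw + spilled / host_link_bw
    return 1.0 / seconds_per_token


GB = 1e9

# Hypothetical hardware: 1000 GB/s GPU memory, 32 GB/s host link.
# Hypothetical model: 16 GB of weights.
fits = tokens_per_second(16 * GB, 24 * GB, 1000 * GB, 32 * GB)
spills = tokens_per_second(16 * GB, 12 * GB, 1000 * GB, 32 * GB)

print(f"model fits in VRAM:   {fits:.1f} tokens/s")
print(f"4 GB spills to host:  {spills:.1f} tokens/s")
```

Under these assumed numbers, spilling a quarter of the model costs far more than a quarter of the speed, because the slow link dominates the time per token. The GPU also sits idle waiting on transfers while still drawing power, which is where the "less efficient" part comes from.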