Futurology

submitted 3 months ago by [email protected] to c/[email protected]

pennomi 3 points 3 months ago

It depends. A lot of LLMs are memory-constrained. If you’re constantly thrashing GPU memory, it can be both slower and less efficient.
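
A rough way to see why thrashing hurts is a back-of-envelope estimate. This is only a sketch under loud assumptions: every weight is read once per generated token, compute time is ignored, and anything that doesn't fit in GPU memory has to be streamed over a slower host link each time. The function name and all the numbers are made up for illustration, not measurements of any real card or model.

```python
def seconds_per_token(model_gb, vram_gb, vram_bw_gb_s, host_link_bw_gb_s):
    """Crude lower bound on time per generated token.

    Assumes (hypothetically) that all weights are read once per token,
    that compute is free, and that weights beyond VRAM capacity are
    re-fetched over the host link every token.
    """
    resident_gb = min(model_gb, vram_gb)        # part served from GPU memory
    spilled_gb = max(model_gb - vram_gb, 0.0)   # part that has to be streamed in
    return resident_gb / vram_bw_gb_s + spilled_gb / host_link_bw_gb_s


# Illustrative numbers only: 24 GB of VRAM at 1000 GB/s, host link at 32 GB/s.
fits = seconds_per_token(model_gb=14, vram_gb=24, vram_bw_gb_s=1000, host_link_bw_gb_s=32)
spills = seconds_per_token(model_gb=40, vram_gb=24, vram_bw_gb_s=1000, host_link_bw_gb_s=32)

print(f"fits in VRAM:   {fits:.3f} s/token")
print(f"spills to host: {spills:.3f} s/token")
```

With these made-up figures the spilled case is dominated by the slow link, and the GPU sits mostly idle while still drawing power, which is where the "slower and less efficient" part comes from.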