What is your opinion of the Large Language Model (LLM) argument made by Reddit?
(self.experienced_devs)
I do not want my content to contribute to a proprietary LLM that will make billions for a large tech company without giving back to the community. Unfortunately, I think the fediverse has a harder time countering large-scale data harvesting than a centralized service like Reddit.
On the other hand, I don't mind open source, privacy-respecting LLMs (is that even a thing for LLMs?) using my content.
I am also wary of big tech companies using my comment history for their LLMs. However, I worry that the tech companies will scrape the data anyway, and that Reddit's API pricing just locks out the open source LLMs. There are a few of those; here are a couple that I have played with:
https://github.com/nomic-ai/gpt4all
https://github.com/ggerganov/llama.cpp
Some projects even try to preserve privacy, but I think that's more a matter of what extra training data you give them and the queries you issue.
https://github.com/imartinez/privateGPT