Technology
This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.
It’s funny how he’s playing this out to be about third party apps like Apollo. Like yeah, that’s what the community cares about, but the reason they’re making the changes is because he’s fucking anal about OpenAI and other companies finding such success with products they have built using data scraped via the Reddit API.
He wants some of that money, not the comparatively tiny amount that Christian got from Apollo.
He also doesn’t seem to get that people root for an underdog. Had he been upfront that they are upset about companies using their API to build massive tools that they can sublicense to other companies, like Microsoft, and make lots of money, people might agree with that.
What he’s framing it as, though, is a big company like Reddit vs. small indie app developers like Christian Selig. Guess who the underdog is in that scenario, hm?
Dude could literally set up a developer program to support “sanctioned” third party devs who pay some sort of yearly fee to access the API, and still raise the cost like he is now to fend off LLMs. But nah, I’d only expect that from somebody who actually wants to solve the problem. Lol
That sounds unnecessarily complex. Just force an authentication of the client (ergo, make it so you can't access the API without logging in) and add api rate limits per user, maybe with higher limits on users that have the paid Reddit membership tier.
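For what it's worth, here is a rough sketch of what "require login, then rate limit per user" could look like. This is only an illustration: the tier names, the per-minute numbers, and the `PerUserRateLimiter` class are all made up, and it uses a plain token bucket, which is not necessarily what Reddit does or would do.

```python
import time
from dataclasses import dataclass

# Hypothetical tiers and limits (requests per minute); the numbers are invented.
TIER_LIMITS = {"free": 60, "premium": 600}


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size
    refill_per_sec: float  # tokens added back per second
    tokens: float          # tokens currently available
    updated: float         # clock reading at the last refill

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerUserRateLimiter:
    """Rejects anonymous calls and keeps one token bucket per logged-in user."""

    def __init__(self, tier_limits=TIER_LIMITS, clock=time.monotonic):
        self.tier_limits = tier_limits
        self.clock = clock
        self.buckets = {}

    def allow(self, user_id, tier="free") -> bool:
        if user_id is None:
            return False  # no login, no API access
        now = self.clock()
        bucket = self.buckets.get(user_id)
        if bucket is None:
            # The bucket is sized by the tier the user has on first request.
            per_minute = self.tier_limits[tier]
            bucket = TokenBucket(per_minute, per_minute / 60.0, float(per_minute), now)
            self.buckets[user_id] = bucket
        return bucket.allow(now)
```

A server would call `limiter.allow(user_id, tier)` on each request and answer with HTTP 429 when it returns `False`. Paid users just get a bigger bucket.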
But I don't think that was the point anyway. It's less work to just start charging for the API. That way they can charge companies like OpenAI, and drive others to use their main app, letting them sell targeted adverts to them too.
Just to poke at a solution a bit further: auth would limit web scrapers, which they don’t want given how valuable user comments and posts are. Rate limiting can cause perf issues depending on how calls are being made, and you’d have to tune it based on usage metrics, number of clients, calls per client, etc., which is even more complex than giving full access to a “managed/sanctioned” client. A sanctioning system gives them full control of the pipeline, with the trade-off being that it’s a bit more work on their end to vet the clients.
But yeah, clearly a solvable problem, but it’s just malice at this point on their part.
That's not the point for Reddit. They need to show investors a path to profitability for Reddit's planned IPO. They plan on harvesting every last ounce of user data, and those third party apps deny them exactly that. That is why they won't back down.
I commented elsewhere in this thread about how Huffman is bsing about this “profitability” angle, and I can’t link to it. :-( Path to profitability for reddit is a pre-seed 1 funding conversation that should have already been iterated on and solved by now, especially if they are going to IPO. Even if they were to harvest user data, that can be done with transparency, but they just don’t want to do it.
They kinda already have something along those lines, or at least it's in the works. I'm pretty sure that's what Devvit is supposed to be, but rather than actually finish that project, they'd rather crusade against the Apollo app for some reason.
The data could just be scraped without the API anyway.
Absolutely, but the API offers a really smooth and convenient way of doing it without a lot of extra processing overhead. Scraping HTML is a little bit more involved.
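To show the difference in effort, here's a small sketch that pulls the same two comments out of a JSON response and out of an HTML page. Both payloads are invented for the example and don't reflect Reddit's real API response or page markup, and the `CommentScraper` class is hypothetical.

```python
import json
from html.parser import HTMLParser

# Made-up payloads for illustration only.
API_RESPONSE = (
    '{"comments": [{"author": "alice", "body": "First!"},'
    ' {"author": "bob", "body": "Nice post."}]}'
)
HTML_PAGE = """
<html><body>
  <div class="comment"><span class="author">alice</span><p class="body">First!</p></div>
  <div class="comment"><span class="author">bob</span><p class="body">Nice post.</p></div>
</body></html>
"""


def comments_from_api(raw):
    # Structured data: one parse call, then plain dictionary lookups.
    return [(c["author"], c["body"]) for c in json.loads(raw)["comments"]]


class CommentScraper(HTMLParser):
    """Tracks which element we are inside and collects text by CSS class."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._field = None   # "author" or "body" while inside such an element
        self._current = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("author", "body"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            # Text may arrive in several chunks, so append rather than assign.
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if self._field is None:
            return
        self._field = None
        if "author" in self._current and "body" in self._current:
            self.comments.append(
                (self._current["author"].strip(), self._current["body"].strip())
            )
            self._current = {}


def comments_from_html(page):
    scraper = CommentScraper()
    scraper.feed(page)
    scraper.close()
    return scraper.comments
```

Both functions return the same list of `(author, body)` tuples, but the HTML version depends on class names and nesting that can change with any redesign, which is the extra processing overhead in practice.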
But using an API requires integration with every individual site they want to consume. Crawlers do not. For the same reason, LLMs aren't using the API.
Reddit could also enforce existing limits or change their TOS to explicitly ban this activity if it was indeed leading to millions of dollars in additional operating expenses. They have done neither.
Huffman is just lying about OpenAI and others being the problem.