this post was submitted on 14 Feb 2025
971 points (98.7% liked)


Reddit is planning to introduce a paywall this year, CEO Steve Huffman said during a videotaped Ask Me Anything (AMA) session on Thursday.

Huffman previously showed interest in potentially introducing a new type of subreddit with "exclusive content or private areas" that Reddit users would pay to access.

When asked this week about plans for some Redditors to create "content that only paid members can see," Huffman said:

It’s a work in progress right now, so that one’s coming... We're working on it as we speak.

When asked about "new, key features that you plan to roll out for Reddit in 2025," Huffman responded, in part: “Paid subreddits, yes.”

Reddit's paywall would ostensibly only apply to certain new subreddit types, not any subreddits currently available.

Reddit executives also discussed how they might introduce more ads into the social media platform. The push for ads follows changes to Reddit’s API policy that, in part, led to the closing of most third-party apps used for accessing Reddit. Reddit makes most of its revenue from ads and can only show ads on its native apps and website.

Reddit started testing ads in comments last year, with COO Jen Wong saying during an AMA that such ads are in “about 3 percent of inventory.” The executive hinted at that percentage growing. Wong also shared hopes that contextual advertising, or ads being shown based on the content surrounding them, will be a “bigger part of” Reddit’s business by 2026.

[–] njordomir 24 points 4 days ago (3 children)

Is there anywhere I can find a complete scrape of Reddit threads and comments from before the 3rd party app apocalypse? There was a lot of useful info shared on there, but I don't want anything to do with what that site has become. I'm happy just to CTRL+F a big dataset. It'll probably still work better than either Reddit or Google does nowadays. Without media I imagine I could fit it somewhere.

Also, Spez is a greedy little pig boy.

[–] MunkysUnkEnz0 7 points 4 days ago* (last edited 4 days ago)

Yes, there's a torrent somewhere...

https://academictorrents.com/details/c398a571976c78d346c325bd75c47b82edf6124e

This is what I could find with a quick search, but I know there's a larger database backup.

[–] Bonskreeskreeskree 2 points 3 days ago

We really need efforts made to bulk upload historical posts of value to Lemmy. If done right, we could significantly expand the number of subs and the amount of content, even if they are ghost towns initially with just the old posts from Reddit. Build it and they will migrate.

[–] [email protected] 1 points 3 days ago (2 children)

are you going to use it to train your deepseek?

[–] njordomir 2 points 3 days ago

I never understood the desire to search in conversational language via AI. It's gone too far for my taste. I just want to be able to scour a huge volume of info for my exact search terms, maybe with a few synonyms or misspellings included. Google and AI keep trying to assume they know what I'm looking for, but they're always wrong (intentionally wrong based on their own motives).

The reason the dataset interests me is that search has gotten so bad that I can't get any non-corporate information from search engines anymore, just more pig swill, chumbucket ads, and misinformation slop. Anything I search for would probably give better results if I just searched old reddit, Wikipedia, and a few other datasets locally in a simple way. Not sure what software is best to use for something like that, but I'd like to collect a few mostly pre-AI datasets now to get the ball rolling before you can't find those online anymore either.
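
For what it's worth, a "simple" local search along those lines could be as small as the sketch below. It assumes the dump has been decompressed to newline-delimited JSON with one record per line and the comment text in a `body` field. That layout is an assumption, not something confirmed for the torrent linked above, and `search_dump` and the file name are made-up names for illustration.

```python
import json
from typing import Iterable, Iterator


def search_dump(lines: Iterable[str], terms: list[str],
                field: str = "body") -> Iterator[dict]:
    """Yield records whose text field contains any of the exact terms.

    Assumption: each line is one JSON object and the text lives in
    `field` ("body" here is a guess at the dump's layout).
    Matching is a plain case-insensitive substring test, so synonyms
    and misspellings have to be listed explicitly in `terms`.
    """
    needles = [t.lower() for t in terms]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        text = str(record.get(field, "")).lower()
        if any(n in text for n in needles):
            yield record


# Hypothetical usage; "comments.ndjson" is a placeholder file name:
# with open("comments.ndjson", encoding="utf-8") as f:
#     for hit in search_dump(f, ["thinkpad", "thinkpads", "thinkapd"]):
#         print(hit.get("body", ""))
```

It streams the file line by line, so memory use stays flat no matter how big the dataset is. It is still a linear scan, though, so every query rereads the whole file.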

[–] [email protected] 3 points 3 days ago

Not everyone is perverted like you.