Please don't do this.
Asklemmy
A loosely moderated place to ask open-ended questions
If your post meets the following criteria, it's welcome here!
- Open-ended question
- Not offensive: at this point, we do not have the bandwidth to moderate overtly political discussions. Assume best intent and be excellent to each other.
- Not regarding using or support for Lemmy: context, see the list of support communities and tools for finding communities below
- Not ad nauseam inducing: please make sure it is a question that would be new to most members
- An actual topic of discussion
Looking for support?
Looking for a community?
- Lemmyverse: community search
- sub.rehab: maps old subreddits to fediverse options, marks official as such
- [email protected]: a community for finding communities
Icon by @Double_[email protected]
I think the community is much more important than just having more content. I would worry that by flooding Lemmy with Reddit's content without the community to support that content could drown everyone out.
Yeah I agree. I think being able to tweak a personalised scraper as an opt-in service per community could work. Some subs just won't work for this, like askreddit and Eli5, but niche communities, news communities and information communities could. I would be happy just seeing the top 2 or 3 posts of the day for some, and for world news, for example, only the posts that pass a certain threshold of upvotes in a certain time or against the sub's size, to make sure the real news pops up quick.
If I wanted to lurk Reddit, I would just do that. Better to stay your separate thing. It is nice to sometimes get news about the other side of the fence.
Nah. This is a fresh start. It's been less than a week and there's already so much more content. It'll grow soon enough. Especially with spez fucking around over there. We want og content here, not all the shit reposts that already plague reddit.
Time to get to work.
Cracks knuckles
Every day the experience gets better. It's great to see, and I'm glad I'm here for it. I've never known social media to bring me actual joy.
I think it's like breaking up with someone and then dedicating yourself to building a weird, soulless android version of your ex.
If you wanna write code to do this ... I'd say skip the bot, write a gateway instead.
Back in the early days of email, there were lots of different email systems, not just the SMTP Internet email we use today. There was UUCP email with "bang paths", where your email address specified a list of servers that a message could be passed through to get to you. There were other networks like FidoNet and WWIVnet, that could send email to Internet email addresses through special "gateway" servers.
A gateway receives messages using one protocol or service, and retransmits or makes them available on another protocol or service.
For a little while in 1992, I had access to read Usenet posts only through a gateway that exported Usenet posts onto the Gopher system.
A gateway between Reddit and Lemmy would appear to Reddit as a web browser, scraping posts and comments; while appearing to Lemmy as a Lemmy instance that users could subscribe to, making each subreddit it scrapes available as a Lemmy community.
So a Lemmy user could subscribe to, say, [email protected] and see a fresh view of AskReddit. The server at reddittolemmy.com would not be a standard Lemmy server with users, but rather a custom gateway server that fetches data from Reddit and makes it available in the form of a Lemmy community.
(If Reddit were not being an asshole, a gateway could be an API client. But Reddit is being an asshole, so a gateway should probably be written as a scraper that accesses Reddit as if it were a normal user using a desktop Web browser.)
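To make the "receive on one side, republish on the other" part concrete, here's a rough sketch of just the translation step in Python. It does no scraping and no federation, and every name in it (`ScrapedPost`, `CommunityPost`, `relay`, the `gateway.example` domain) is made up for illustration; a real gateway would also have to speak whatever protocol Lemmy instances actually expect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapedPost:
    """A post as the scraping side might see it (hypothetical fields)."""
    subreddit: str
    post_id: str
    title: str
    body: str
    author: str


@dataclass(frozen=True)
class CommunityPost:
    """The same post as the Lemmy-facing side might expose it (hypothetical fields)."""
    community: str  # e.g. "askreddit@gateway.example"
    title: str
    body: str
    source_id: str  # lets the gateway recognise a post it has already relayed


def community_address(subreddit: str, gateway_domain: str) -> str:
    """Map a subreddit name onto a community address hosted by the gateway."""
    return f"{subreddit.lower()}@{gateway_domain}"


def relay(post: ScrapedPost, gateway_domain: str) -> CommunityPost:
    """Translate one scraped post into the form the gateway would publish."""
    credited_body = f"{post.body}\n\nOriginally posted by u/{post.author}"
    return CommunityPost(
        community=community_address(post.subreddit, gateway_domain),
        title=post.title,
        body=credited_body,
        source_id=f"{post.subreddit.lower()}/{post.post_id}",
    )
```

Everything hard (fetching pages, pretending to be an instance, delivering to subscribers) lives outside this function; the mapping itself is the easy bit.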
This is a great idea.
I don't particularly think the whole of Reddit needs to be scraped though. I could be happy with only scraping posts that pass a certain threshold of votes against the subreddit's subscriber count and maybe getting those crossposted to the Lemmy equivalent communities that want to opt in to such a service. This would be especially useful for World News and the more niche subreddits that don't yet have a big enough userbase here.
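Something like this is what I have in mind for the threshold, as a minimal Python sketch. The function names and the 1% default ratio are placeholders I picked for illustration, not numbers anyone has tuned.

```python
def passes_threshold(score: int, subscribers: int, min_ratio: float = 0.01) -> bool:
    """True if a post's votes are high enough relative to the subreddit's size."""
    if subscribers <= 0:
        # No meaningful subscriber count, so there is nothing to compare against.
        return False
    return score / subscribers >= min_ratio


def select_posts(posts: list[tuple[str, int]], subscribers: int,
                 min_ratio: float = 0.01) -> list[str]:
    """Keep the titles of (title, score) pairs that clear the threshold."""
    return [title for title, score in posts
            if passes_threshold(score, subscribers, min_ratio)]
```

Scaling by subscriber count means a small niche sub and a huge default sub can share the same setting, and each opted-in community could still override `min_ratio`.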
The hard part isn't describing which posts or comments need to be gatewayed.
The hard part is being able to deliver posts and comments across the gateway at all.
You could dedicate a community to reddit reposts easy enough. If people want to see the stuff from reddit they can sub, if they would rather wash their hands of reddit they can ignore it or block it.
eh, maybe wait until 0.18 rolls out across the fediverse before scraping reddit for content
Oh? What's that going to do? I'm out of the loop.
0.17.4 is what is used now and it has a number of issues, including the new/scrolling bug. 0.18 is supposed to fix that & other issues as well.
I would rather have content posted by humans. Lemmy doesn't need to be reddit anyway.
Thoughts: Content is good atm. If it's spammy it'll get blocked. Or the instance de-federated. Is going to Reddit boosting Reddit or draining it by using the content? Reddit has a lot of bad content that Lemmy is free of.
But I also liked the...(whispers) cosplay...
In general, I really do not like this idea. Lemmy is Lemmy and should not, directly or indirectly, turn into Reddit.
Use what fediverse is good at. Make an instance where these bots reside and anyone who doesn't want them can just defederate.
I mean, any bot you spin up is going to go away at the end of the month when they kill API access, right?
You can always scrape, or use your cookies. There are ways around this for one off bots. Not that I'm an advocate.
Good question! I don't really know how I'd feel about this.
On the one hand, for me to be ok with it, it needs to be CLEAR that it's a bot reposting reddit content, maybe even limited to specific communities for the sake of archiving.
On the other, I do want lemmy to be distinct and I am curious to see where it will end up. I also feel like we shouldn't be trying to emulate reddit exactly, as that would mean we also get the crappy parts as well. Also, that would be limiting. Who knows what lemmy can become? Why limit it to being reddit 2.0, you know?
I think reposting content from reddit is fine, but we don't need a bot for it. I'd rather see individuals bring over specific posts they think are notable as opposed to automatically copying everything they've got.
I say keep it distinct.
Reddit has had its day.
I think there's value to it. There's a ton of really important and helpful documentation on there for all kinds of fields, in the past couple days I've noticed some references I've had for tech issues and guitar repair are no longer accessible, for example. There's going to be a period where looking for help/answers online for certain issues is going to be a nightmare.
Yes, content is important, but it's not about the amount of content, but the quality of the content and discussion. Imagine every day you come to Lemmy and your front page is flooded with 0 upvote posts with no discussion from Reddit. You'd probably leave right away.
Manually cross-posting content that you like is probably better, since you're more likely to choose high-quality relevant content, and (I'm assuming) you'd be rate-limited to an appropriate amount.
Some people set things up like this on Mastodon, bots that reposted things from Twitter. Personally, I don't love them.
When everyone left Digg and got on Reddit back in the day, no one set up bots to automatically repost things from Digg to Reddit... it wouldn't even have been helpful, because people just stopped posting nearly as much stuff on Digg.
I don't like those bots either and make it a point to not follow them. I'd say if the OP can't see the people trying to interact with the post, what's the point?
I am a veteran of the Digg migration. The world is very, very different than it was back then. For one thing, Reddit was already an established alternative, just with a lower user base than Digg. Lemmy feels much more like building a community completely from scratch.
I've been beating this drum since I got on here.
Here's the software you would need to put it in a special instance: https://github.com/rileynull/RedditLemmyImporter
Nah. If you see something interesting just post the direct link to it instead of going through reddit. Lemmy is a content aggregator after all.
I think it could be great for some subs that are very dependent on having many users, because it's the rare users that provide content. Subs like AITA or TIFU, where you can react and discuss inside the community, but get content from outside.
Ideally tho, they would just become self sustainable
It'd be kind of nice to play catch up a bit with the, what, 18 years of content on Reddit
I know it might feel soulless, but having a constant stream of "pre-approved" content isn't the worst idea
Bots in general suck.
But if you as a human want to sift and repost reddit content, please do.
It'd be a good idea to create an instance, as someone said here, with different communities for the kinds of posts people could contribute, whether the reposts are made by people or by bots (in which case posts would need to be filtered by upvotes or upvote ratio, or manually selected). For example, some of the posts I was referring to in my first comment were guides or wikis for certain apps: important or interesting knowledge that won't be here otherwise, and that would force people like me to keep relying on Reddit.
That's why I said just get the content of those posts and just give credits to the OP without just copying a link to redirect people to Reddit.
In fact, I wouldn't want this to be filled with Reddit spam posts as most of them are useless.
As Reddit turns into trash, the moderation and content quality will drop. Importing the content may seem interesting at first, but in the long run it won't be worth the effort.
I think many would be more interested in a migration tool. A way of porting a subreddit's worth of content to lemmy and start off strong, as well as preserve what might be several subreddits getting nuked as damage control.
A few comments, some of which have been touched on by others:
- One of the reasons Reddit went to the new pricing was to prevent LLMs from scraping their content and using it for free. I don't think a bot like this would work for the same reason.
- I personally don't want a duplicate of Reddit. I'd prefer this to be a new community that can do better.
- One of the things that drove me away from Reddit (besides their horrible handling of 3PAs) was all the bot posts. If there were a way to make sure that bot accounts were identified as such, that would be perfect. Hell, I'd rather have zero bot posts than the current state of Reddit.
TLDR: I vote no.
I think it should be left up to the mods of individual communities, maybe even better, keep it instance specific and other instances can de-federate if they feel it's a problem.
I subscribed to [email protected], which uses a bot to repost from various places. But it was overwhelming to get every hackernews post added and I unsubscribed. Reposting everything from a big link aggregator is too much. But it might be okay if just duplicating a smaller subreddit.
Then there is the problem that Reddit will likely fight your efforts to scrape it without paying api fees.
It'd certainly help grow Lemmy by helping with the migration (at least for me, as I don't want to go back to Reddit, but there are subs there with interesting info for me that won't migrate), but it shouldn't contain any direct links to Reddit, just credits to the OP.
I would like it, even if just for my instance. There are some subreddits that I liked, but I won't go back to reddit now.
I'm not entirely against the idea, but also really don't want to turn the fediverse into the flood of content that is reddit. Maybe some kind of relative upvote filter based bot. Only posts that get some certain percentage of above average number for a given sub, get scraped.
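Roughly what I mean by a relative filter, as a small Python sketch. `above_average` and the 50% default margin are just names and numbers I made up; the input is assumed to be a mapping of post id to score for recent posts in one sub.

```python
from statistics import mean


def above_average(scores: dict[str, int], margin: float = 0.5) -> list[str]:
    """Return ids of posts scoring at least `margin` above the sub's mean score.

    With margin=0.5 a post needs 150% of the average score to be picked up.
    """
    if not scores:
        return []
    cutoff = mean(scores.values()) * (1 + margin)
    return [post_id for post_id, score in scores.items() if score >= cutoff]
```

Because the cutoff is computed per sub, a quiet subreddit and a huge one would both pass along only their own standouts rather than everything over some fixed number.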
I've been toying with this idea as well. I don't think it's a good idea to scrape all content. This could drown out the lemmy-original content, especially when large subreddits are concerned. Maybe an upvote threshold (only scrape if more than X upvotes) would be a good idea.
I would also scrape only the post itself, not the comments. Best to have our own organic discussions here.
Finally, it should be very clear that a bot is posting these things. Ideally the bot would also ensure it is not re-posting something that was already posted by a Lemmy user just a bit earlier.
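The "don't repost what's already here" check could be as simple as comparing links, something like the Python sketch below. The helper names are hypothetical, and the normalisation is a deliberate simplification: it ignores the scheme, host case, trailing slashes and query strings, which may be too aggressive for some sites.

```python
from urllib.parse import urlsplit


def normalise(url: str) -> str:
    """Reduce a link to host + path so trivially different URLs compare equal."""
    parts = urlsplit(url.strip())
    return f"{parts.netloc.lower()}{parts.path.rstrip('/')}"


def not_yet_posted(candidates: list[str], already_posted: list[str]) -> list[str]:
    """Return candidate links not already in the community, in original order."""
    seen = {normalise(url) for url in already_posted}
    fresh = []
    for url in candidates:
        key = normalise(url)
        if key not in seen:
            seen.add(key)  # also drops duplicates within the candidates themselves
            fresh.append(url)
    return fresh
```

This only catches link posts pointing at the same page; spotting a text post that covers the same story would need something smarter.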
A couple of discord servers I'm on do that. It works pretty well, but only when the server is topic specific