So, Lemmy is sometimes missing content. I don't regret switching from Reddit to Lemmy but, especially for niche communities, the content isn't always here.
My idea to fix this is a Fediverse-based content relay named Relly.
Relly allows you to select RSS feeds, Mastodon users, Mastodon hashtags and Mastodon instances (so, the top posts on that instance) as sources for content, and post them to your favourite Lemmy community.
There are several features which make Relly more useful and keep it from becoming spam:
- Limits for a source (example: only up to 5 posts a day from this RSS feed)
- Limits for a community (example: only up to 5 posts a day to !archlinux)
- Global limits (example: only up to 10 posts made each day)
- Opt-out for servers & communities (instance and community moderators will be able to ask to be put in the UNLIST, which blocks Relly by default on their instance/community; this isn't an anti-spam measure so much as a tool to keep ordinary users from using Relly in a malicious and spammy way)
- Order posts (so, if you have 10 RSS posts and 10 Mastodon posts and a global limit of 15 posts, you can either have the 10 RSS posts and the 5 most upvoted Mastodon posts, or some RSS posts and some Mastodon posts [always the most upvoted])
- Multiple communities (post the same content to different communities, or set up a fraction [ex. 50%], so that each post has a certain chance of being posted to a certain community)
- Dynamic limits: You can set an objective of active users/posts made in the last 24 hours, so that the limits (either for a specific source, a specific community or globally) are reduced as the community gets closer to it. Example: if you set an objective of 50 posts, and 25 have been made, the limits of Relly will be 50% of what they were originally set to be; this allows Relly to completely stop posting to a community once the objective has been reached.
- Do not repeat: before posting a link, Relly checks whether it was already posted in the community within a specific time period (by default, 48 hours)
- Modularity: new post sources and post outputs can be implemented; an example could be an e-mail output, so that you can run Relly locally and receive an e-mail every day with your favourite news
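To make the dynamic limits arithmetic concrete, here is a rough sketch of how the scaling could work. Nothing here exists yet: `effective_limit` and its parameters are names made up for illustration, and treating an objective of 0 as "no objective set" is my own assumption. It also rounds down, which is just one possible choice.

```rust
/// Scale a configured limit by how far the community still is from its
/// activity objective. Hypothetical helper, only meant to illustrate the idea.
///
/// `base_limit`: the limit as originally configured (e.g. 10 posts a day)
/// `objective`:  target number of posts in the last 24 hours
/// `observed`:   posts actually made in the last 24 hours
pub fn effective_limit(base_limit: u32, objective: u32, observed: u32) -> u32 {
    // Assumption: an objective of 0 means "no objective set", so the
    // configured limit is used unchanged.
    if objective == 0 {
        return base_limit;
    }
    // Objective reached or exceeded: stop posting entirely.
    if observed >= objective {
        return 0;
    }
    // Apply the remaining fraction of the objective to the base limit.
    // Integer maths, rounded down; u64 keeps the product from overflowing.
    let remaining = u64::from(objective - observed);
    (u64::from(base_limit) * remaining / u64::from(objective)) as u32
}
```

With the example from the list, `effective_limit(10, 50, 25)` gives 5 (50% of the original limit), and `effective_limit(10, 50, 50)` gives 0, so Relly would stay quiet once the community is active enough on its own.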
Relly is designed to be used by moderators of communities, but users can also use it. A user should always ask the moderator if it is OK to use it. A moderator should always ask the admins if it is OK to use it. Moderators, if they are the ones using it, should also make the list of sources public, and allow the community to discuss possible edits to the list. The admins should note in the sidebar whether Relly is OK to use for moderators of communities.
At the moment, Relly is just the idea that I presented here; I want to hear the community's feedback, and if the community is OK with this project being made, I will start working on it (I will make it in Rust and release under the MIT License).
I think the limited number of posts per day feature of this is really the standout that makes this intriguing to me. We already have the Lemmit bot posting every single post from reddit to Lemmy like a firehose, but discussion on them is sort of like yelling into a void. If we only post the top ~3 posts per day from a subreddit, we can condense any conversation into just those and guarantee that it's not going to get washed away with the rest of the junk content. Even though it's not ideal, I think a crutch like this could go a long way to seeding some "natural" activity.
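As a rough illustration of the "only the top few posts per day" idea, picking the most upvoted candidates could look something like the sketch below. `Candidate` and `top_n` are hypothetical names, and it assumes every source can report some kind of score, which may not hold for plain RSS feeds.

```rust
/// A post that a source has offered for relaying (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub url: String,
    /// Upvotes, favourites, or whatever the source reports as popularity.
    pub score: u32,
}

/// Keep only the `limit` highest-scored candidates.
pub fn top_n(mut candidates: Vec<Candidate>, limit: usize) -> Vec<Candidate> {
    // Stable sort, highest score first; ties keep their original order.
    candidates.sort_by(|a, b| b.score.cmp(&a.score));
    // Drop everything past the limit (no-op if there are fewer candidates).
    candidates.truncate(limit);
    candidates
}
```

Calling `top_n(posts, 3)` on a day's worth of candidates would leave just the three highest-scored ones, so the discussion ends up concentrated on those.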
Yes, and the fact that it doesn't post any link that was already posted in the last 48 hours avoids spamming.
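A minimal sketch of that check, assuming Relly keeps its own record of what it has posted. `RecentLinks` and `try_post` are made-up names; a real version would probably also have to ask the community whether someone else posted the link, and this one compares URLs as exact strings without normalising them.

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

/// Remembers when each link was last posted (hypothetical type).
pub struct RecentLinks {
    window: Duration,
    last_posted: HashMap<String, SystemTime>,
}

impl RecentLinks {
    pub fn new(window: Duration) -> Self {
        Self { window, last_posted: HashMap::new() }
    }

    /// Returns true and records the link if `url` was not posted within
    /// the window ending at `now`; returns false otherwise.
    pub fn try_post(&mut self, url: &str, now: SystemTime) -> bool {
        if let Some(&earlier) = self.last_posted.get(url) {
            // duration_since fails if `earlier` is after `now` (clock
            // changes); treat that case as "posted just now".
            let age = now.duration_since(earlier).unwrap_or(Duration::ZERO);
            if age < self.window {
                return false;
            }
        }
        self.last_posted.insert(url.to_string(), now);
        true
    }
}
```

With `RecentLinks::new(Duration::from_secs(48 * 60 * 60))`, a link is accepted the first time, rejected if it shows up again an hour later, and accepted again once 48 hours have passed.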
I think subreddits could be used through the RSS feed system, since the Reddit API is expensive: if we set up an RSS feed containing the top posts of the last 24 hours, we can extract links from there.
Is scraping reddit's HTML without using an API doable? I'm not sure if the reddit RSS feed has any notion of upvotes/popularity.
I had to enter reddit (eeewww..) but I found it: https://www.reddit.com/r/rss/comments/e3mx1j/how_to_get_rss_feed_of_a_subreddit_with_top_posts/ Check the first comment.