this post was submitted on 05 Jul 2023
74 points (90.2% liked)
Technology
Anyone can set up a Lemmy instance, write a small script/bot to find and follow all the communities on all the instances in the Fediverse and store all that data. It's not even hard, maybe a day of work for a proof of concept if you start from zero. (Then you have to figure out how to scale it properly, how to detect you're getting defederated and how to change domains to restart without the defederations. Maybe a week's worth of effort.)
Threads would be way overkill to achieve this goal. You don't need any users. You don't want any users. Just your one account that follows everything.
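To give a rough idea of how small the "find all the communities" part could be, here is a minimal sketch in Python. It assumes the instance exposes Lemmy's v3 HTTP API with a paginated `/api/v3/community/list` endpoint whose JSON response has a `communities` list where each entry carries `community.actor_id`. That response shape is my assumption, so check it against the API docs for the version you target. The names `list_url`, `http_get_json` and `iter_communities` are made up for illustration, and the follow step (which needs a logged-in account) is left out.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator


def list_url(instance: str, page: int, limit: int = 50) -> str:
    """Build the community-list URL for one page (assumed Lemmy v3 endpoint)."""
    query = urllib.parse.urlencode({"type_": "All", "page": page, "limit": limit})
    return f"https://{instance}/api/v3/community/list?{query}"


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_communities(
    instance: str,
    get_json: Callable[[str], dict] = http_get_json,
    limit: int = 50,
    max_pages: int = 1000,
) -> Iterator[str]:
    """Yield the actor_id of every community the instance knows about.

    Stops at the first empty page, or after max_pages as a safety cap.
    """
    for page in range(1, max_pages + 1):
        data = get_json(list_url(instance, page, limit))
        communities = data.get("communities", [])
        if not communities:
            return
        for view in communities:
            # Assumed shape: {"community": {"actor_id": "https://..."}, ...}
            yield view["community"]["actor_id"]
```

Something like `for actor_id in iter_communities("lemmy.world"): print(actor_id)` would then walk one instance. Repeating that per instance and following each result from your single account is the rest of the proof of concept.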
Edit: or you can just set up a web crawler, like the ones Google Search uses, to find and store all the data you're looking for. You don't necessarily need to be federated or use ActivityPub.
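The crawler route is not much code either. Below is a bare-bones sketch, not how Google or anyone else actually does it: a breadth-first crawl that stays on one host and stores each page's HTML. The `fetch` function is passed in (hypothetical, anything that takes a URL and returns HTML text), and `LinkParser` and `crawl` are names I made up. A real crawler would also need rate limiting, robots.txt handling and error handling.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Dict
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url: str, fetch: Callable[[str], str], max_pages: int = 100) -> Dict[str, str]:
    """Breadth-first crawl of one host; returns {url: html} for visited pages."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages: Dict[str, str] = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments so pages dedupe.
            link = urljoin(url, href).split("#")[0]
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Since all of this is plain HTTP against public pages, the instance never sees an ActivityPub actor to defederate from.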