datahoarder
Who are we?
We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.
We are one. We are legion. And we're trying really hard not to forget.
-- 5-4-3-2-1-bang from this thread
I have wondered: is there an easy way to search the Wayback Machine for archived Reddit data?
And for comments that people back up to CSV with tools like Power Delete Suite, is there a nice way to display them other than in Excel?
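One possible starting point is the Wayback Machine's CDX API, which lists captures by URL. It matches on URLs, not on the text of comments, so this is a rough sketch of "find archived threads under a subreddit" rather than a full-text search. The function names and the example prefix are made up for illustration; the query parameters (`url`, `matchType`, `output`, `filter`, `collapse`, `limit`, `from`, `to`) are the ones the CDX server documents.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url_prefix, year_from=None, year_to=None, limit=50):
    """Build a CDX query URL listing captures whose URL starts with url_prefix."""
    params = {
        "url": url_prefix,
        "matchType": "prefix",       # match everything under the prefix
        "output": "json",            # rows come back as a JSON list of lists
        "filter": "statuscode:200",  # skip redirects and error captures
        "collapse": "urlkey",        # at most one row per distinct URL
        "limit": str(limit),
    }
    if year_from is not None:
        params["from"] = str(year_from)
    if year_to is not None:
        params["to"] = str(year_to)
    return CDX_ENDPOINT + "?" + urlencode(params)


def captures_to_links(rows):
    """Turn CDX JSON rows (the first row is the header) into replay URLs."""
    if not rows:
        return []
    header, *records = rows
    ts = header.index("timestamp")
    original = header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[original]}" for r in records]


# Example (hypothetical prefix): fetch this URL with any HTTP client,
# parse the response as JSON, and pass the result to captures_to_links().
query_url = build_cdx_query("reddit.com/r/DataHoarder/comments/", year_from=2023)
```

Keeping the URL building and the row parsing separate from the actual HTTP request makes it easy to try the logic offline before hitting the archive.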
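For the CSV side, a few lines of Python can print the comments as plain readable text. This is only a sketch: the column names `date`, `subreddit`, and `body` are assumptions, so adjust them to whatever headers your export actually has.

```python
import csv
import io
import textwrap


def render_comments(csv_text, width=78):
    """Format each CSV row as a small readable block of plain text."""
    blocks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names here are assumed; missing ones fall back to "?".
        heading = f"[{row.get('date') or '?'}] r/{row.get('subreddit') or '?'}"
        # textwrap.fill re-wraps the text and collapses line breaks inside
        # a comment into spaces, which is fine for skimming.
        body = textwrap.fill(
            row.get("body") or "",
            width=width,
            initial_indent="    ",
            subsequent_indent="    ",
        )
        blocks.append(f"{heading}\n{body}")
    return "\n\n".join(blocks)
```

To use it on a real file, read the file as text (for example `open("comments.csv", newline="", encoding="utf-8").read()`, where the filename is whatever your backup is called) and print the result.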
It could be done, but in my opinion that isn't the best solution. What I was thinking of was having a bot migrate all the comments and posts here (or to another instance). Rather than trying to create new users on Lemmy, the bot would post everything under its own account and put the original username in each comment, like "Bread commented" followed by their comment. That way we still know who said it.
If the bot maker had control of the instance, we could probably put everything in chronological order by timestamp, so it would look like the comments were all made here originally. The only indicator that they weren't would be the bot's name as the username. Search algorithms would then be able to search it just like Reddit.
I believe the best way to archive a forum-style website is on another forum, where everything has a one-to-one equivalent.
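To make the idea concrete, here is a minimal sketch of the two steps the bot would need before posting anything: attributing each comment to its original author in the body, and ordering the archive by timestamp. The `ArchivedComment` type and both function names are hypothetical, and the actual posting to an instance is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ArchivedComment:
    author: str       # original Reddit username
    created_utc: int  # original Unix timestamp
    body: str


def to_bot_post(comment):
    """Prefix the body with the original author, since the bot account posts it."""
    return f"{comment.author} commented:\n\n{comment.body}"


def migration_order(comments):
    """Oldest first, so reposting in this order mirrors the original thread."""
    return sorted(comments, key=lambda c: c.created_utc)
```

For example, `to_bot_post(ArchivedComment("Bread", 1686000000, "Nice setup"))` gives a body that starts with "Bread commented:". Sorting by timestamp also means a parent comment is always reposted before its replies, since a reply cannot be older than what it replies to.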
As for moving Datahoarder to a new instance, that sure would make backups a lot nicer if a datahoarder ran it. I am surprised that it isn't on its own already considering the topic. Same thing with self-hosted.
I love this idea. It raises some issues to think about, too. Like, who “owns” that data? Would Reddit file a lawsuit against the Lemmy instance arguing that the data belongs to Reddit? Does the data belong to the users who posted? What TOS do we agree to when signing up for a Reddit account? Are we giving them ownership of all content we post?
I think it would be very hard to argue in court that someone's own ideas and thoughts belong to Reddit just because they posted them there. That is also why you can request that Reddit delete all your data, and they must comply.
As for the legality of taking those comments and posts, I don't know for certain. The Internet Archive already does it, though. If anything, they would have to remove any content whose author wants it removed, like with a DMCA request.
Like with most things on the internet, if it is illegal and nobody is enforcing it, it might as well be legal.
I'm not sure it's possible to backdate posts, even on your own instance. But otherwise I think this might be the way.