this post was submitted on 15 Oct 2024
1021 points (99.5% liked)

Technology

Wayback Machine back in read-only mode after DDoS, may need further maintenance.

[–] dohpaz42 132 points 1 month ago (4 children)

Maybe it’s time to federate the IA.

[–] [email protected] 76 points 1 month ago (7 children)

One of the rare use cases of a blockchain actually being useful. A federated internet archive that uses a blockchain to validate that the saved data has not been altered by a malicious actor trying to tamper with proofs

That would be really cool but horribly inefficient because of the sheer amount of storage required
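The tamper-evidence part of that idea can be sketched with nothing more than a hash chain. This is a rough illustration, not how any real archive works: the record layout and the names `build_chain` and `first_broken_link` are made up for the example, and it deliberately leaves out networking and consensus, which is what a full blockchain would add on top.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the very first record


def record_hash(prev_hash: str, url: str, content_sha256: str) -> str:
    """Hash one record together with the hash of the record before it."""
    payload = json.dumps(
        {"prev": prev_hash, "url": url, "content": content_sha256},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(snapshots):
    """Turn (url, page_bytes) pairs into a list of hash-linked records."""
    chain, prev = [], GENESIS
    for url, body in snapshots:
        content = hashlib.sha256(body).hexdigest()
        prev = record_hash(prev, url, content)
        chain.append({"url": url, "content": content, "hash": prev})
    return chain


def first_broken_link(chain):
    """Return the index of the first record that fails verification, or None."""
    prev = GENESIS
    for index, rec in enumerate(chain):
        if record_hash(prev, rec["url"], rec["content"]) != rec["hash"]:
            return index
        prev = rec["hash"]
    return None
```

Note that only hashes are chained here, not the saved pages themselves, so the chain stays small. The storage problem comes from every participant keeping copies of the actual data.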

[–] [email protected] 123 points 1 month ago (1 children)

horribly inefficient

The core feature of all blockchain tech.

[–] [email protected] 13 points 1 month ago (1 children)

To be fair that would not necessarily be because of the blockchain part, more because of the decentralized/federated nature of this theoretical network

[–] [email protected] 29 points 1 month ago (1 children)

Sure, but the networking and consensus-finding are defining features of a blockchain. Nobody calls a git repo a blockchain.

[–] [email protected] 5 points 1 month ago* (last edited 1 month ago) (3 children)

You mean a "github repo". Git by itself doesn't give a hoot about validating authors whatsoever (I could sign as "Bill Gates [email protected]", and git would happily accept the commit), and it's not federated (multiple people manually downloading various states of the repo at various times doesn't count).

Github ensures owners are who they are, as linked to their profile (though email validation only goes as far as "Well, they clicked the link in the email, so this must be their email account"). Github also isn't federated, since that one site going down takes all the repos with it (unless someone had it cloned, but again, random people downloading at random times yields different states of the repo, depending on when the clone/fetch occurred, so you'd end up with tens/hundreds/thousands of sources of various levels of truth).
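For anyone curious why Git accepts any author: in an unsigned commit the author is just a line of text that gets hashed along with the tree and the parent ID. Here is a small sketch, assuming a SHA-1 repository and unsigned commits. `commit_id` is a hypothetical helper written for this example, not part of Git, and the timestamp is arbitrary.

```python
import hashlib
from typing import Optional

# Well-known ID of Git's empty tree object, so the example needs no files.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(author: str, message: str, tree: str = EMPTY_TREE,
              parent: Optional[str] = None,
              when: str = "1700000000 +0000") -> str:
    """Compute the object ID of an unsigned commit in a SHA-1 repository."""
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    # The author and committer are free text; nothing here checks who typed them.
    lines.append(f"author {author} {when}")
    lines.append(f"committer {author} {when}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()
```

Calling `commit_id("Anyone At All <anyone@example.invalid>", "first")` works for any name you like. The parent ID is part of the hashed text, which is why history is tamper-evident, but the only identity check Git offers is optional commit signing.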

[–] whostosay 7 points 1 month ago (1 children)
[–] [email protected] 4 points 1 month ago (1 children)

It's not a minor nitpick. The comment was that "nobody calls a git repo a blockchain". It's because it's not a blockchain, or even remotely similar to one.

[–] whostosay 4 points 1 month ago (1 children)

You are right, I was just poking fun a little. No hard feelings. You did just kind of um akshually my use of um akshually tho

[–] Valmond 2 points 1 month ago (1 children)

Github is a website, controlled by no less than Microsoft lol.

A git repo can be spread out like a "blockchain" without the messy validation and coin earnings, maybe that was the intended comparison?

[–] [email protected] 2 points 1 month ago (1 children)

Could it be? Sure, I don't see a technological reason why someone couldn't build a system like that.

Are they now (federated, or blockchained)? No.

[–] Valmond 1 points 1 month ago (1 children)

True.

I'm working on a decentralised sharing protocol, but it uses reciprocal sharing so you'd have to have large storage anyways.

[–] [email protected] 1 points 1 month ago (1 children)

Hoof, yeah. Collaboration tools always seem to come down to bandwidth, storage, or both.

[–] Valmond 1 points 1 month ago (2 children)

You need to use something I guess :-) Any examples?

[–] [email protected] 2 points 1 month ago

Honestly, despite not actually being federated, I've been using raw Git a lot recently. As opposed to ActivityPub, you can always download the current state of the central repo and bring yourself up to date. I just wish it were easier to store binary data in it (e.g. sharing my MP3s between my laptop and phone)

Of course, that's not a collaborative use-case. I have no intention of opening my files to the world. Just noting that ActivityPub has some pretty severe limitations (if my mbin server is offline, I won't get the updates I missed while it was down, ever. And if I can't process messages in real-time, I miss those too).

[–] kautau 44 points 1 month ago (1 children)

I mean you don’t need the blockchain for that. The same way that distro mirrors don’t need the blockchain. It can be federated, with each upload being verified through hashes to confirm it is in fact the real upload. I would argue that something like a blockchain would take the authority away from them, letting a bad actor who spins up enough servers poison the chain and claim authority just because they had the computing power
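The mirror-style check described here is only a few lines. This is a minimal sketch, assuming the archive publishes a SHA-256 digest for each upload through some channel you already trust. The function names are invented for the example.

```python
import hashlib
import hmac
from typing import Iterable, Iterator


def read_chunks(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield a file's bytes in fixed-size chunks so big uploads fit in memory."""
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            yield chunk


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hash a stream of byte chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def is_authentic(chunks: Iterable[bytes], published_hex: str) -> bool:
    """True if the data hashes to the digest the original host published."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_hex(chunks), expected)
```

A mirror would call something like `is_authentic(read_chunks(path), published_digest)` before serving a file. The authority stays with whoever publishes the digests, which is the point being made above.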

[–] [email protected] 7 points 1 month ago (2 children)

Bro hear me out bro

We put the whole thing on a blockchain. BUT

  • entry order isn't super important

  • you don't need to validate the entire archive

So basically a blockchain, but for a bunch of files, not ordered. So instead of a native token, users can just trade bits of information as currency. 🙀

If it goes really well, we could even recruit one of the Bitcoin developers to help.

[–] kautau 8 points 1 month ago

lol I fucking hate this because idiots will read this and be like “oh shit is this the new blockchain”

Well done

[–] RedStrider 13 points 1 month ago (2 children)
[–] kautau 5 points 1 month ago

Yes, this is a great example of where IPFS would work (specifically for file hosting, not necessarily for the actual web interface), and also, no, IPFS is not a blockchain, and it shouldn’t be. I thought we were past the whole “can this be a blockchain” thing, but here we are. Blockchain is cool tech. It’s also incredibly inefficient for anything beyond a transaction ledger, or in today’s case, money laundering and trying to avoid taxes and regulation.

[–] [email protected] 10 points 1 month ago (1 children)

The thing is, sometimes articles must be removed from the IA (for copyright, which I disagree with, or when leaked information could threaten lives). With a blockchain this would be impossible

[–] tehmics 4 points 1 month ago (2 children)

this would be impossible

Perfect.

I'd be interested in seeing real examples where lives are threatened. I find it unlikely that the internet archive would be the exclusive arbiter of so-called deadly information

[–] [email protected] 9 points 1 month ago (1 children)

There was an actual example where a journalistic article about Afghanistan accidentally leaked names of some sources and people who helped westerners in Afghanistan, which did actually endanger those people’s lives.

[–] tehmics 2 points 1 month ago (2 children)

If they're leaked, they're leaked. The archive doesn't change that one way or the other

[–] [email protected] 3 points 1 month ago (1 children)

I thought of something but I don’t know if it’s a good example.

Here’s the hypothetical:

A criminal backs up a CSAM archive. Maybe the criminal is caught, heck say they’re executed. Pedos can now share the archive forever over encrypted messengers without fear of it being deleted? Not ideal.

[–] tehmics 2 points 1 month ago* (last edited 1 month ago)

Yeah this is a hard one to navigate and it's the only thing I've ever found that challenges my philosophy on the freedom of information.

The archive itself isn't causing the abuse, but CSAM is a record of abuse and we restrict the distribution not because distribution or possession of it is inherently abusive, but because the creation of it was, and we don't want to support an incentive structure for the creation of more abuse.

i.e. we don't want more pedos abusing more kids with the intention of archival/distribution. So the archive itself isn't the abuse, but the incentive to archive could be.

There are also a lot of questions with CSAM in general that come up about the ethics of it that I think we aren't ready to think about. It's a hard topic all around and nobody wants to seriously address it beyond virtue signalling about how bad it is.

I could potentially see a scenario where the archival could be beneficial to society similar to the FBI hash libraries Apple uses to scan iCloud for CSAM. If we throw genAI at this stuff to learn about it, we may be able to identify locations, abusers and victims to track them down and save people. But it would necessitate the existence of the data to train on.

I could also see potential for using CSAM itself for psychotherapy. Imagine a sci-fi future where pedos are effectively cured by using AI trained on CSAM to expose them to increasingly mature imagery, allowing their attraction to mature with it. We won't really know if something like that is possible if we delete everything. It seems awfully short sighted to me to delete data no matter how perverse, because it could have legitimate positive applications that we haven't conceived of yet. So to that end, I do hope some 3 letter agencies maintain their restricted archives of data for future applications that could benefit humanity.

All said, I absolutely agree that the potential of creating incentives for abusers to abuse is a major issue with immutable archival, and it's definitely something that we need to figure out, before such an archive actually exists. So thank you for the thought experiment.

[–] [email protected] 2 points 1 month ago

We don't need a blockchain for that.

Having multiple servers which store file checksums would have much less overhead, would be easily repeatable and appendable, with no need for unnecessary computational labor. Linux Mint currently uses checksums to verify that a downloaded ISO has not been altered in any way, and the same process can work for any file (preferably not humongous files).
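One way the multi-server version could look, as a sketch only: ask several independent servers for the checksum they have on record and accept a file only if enough of them agree. The server names and the `quorum` parameter are assumptions for illustration, and how the digests get fetched is left out.

```python
import hashlib
from collections import Counter
from typing import Dict, Optional


def agreed_digest(reported: Dict[str, str], quorum: int) -> Optional[str]:
    """Return the digest that at least `quorum` servers report, else None."""
    if not reported:
        return None
    counts = Counter(digest.lower() for digest in reported.values())
    digest, votes = counts.most_common(1)[0]
    return digest if votes >= quorum else None


def verify_against_servers(data: bytes, reported: Dict[str, str],
                           quorum: int) -> bool:
    """True if the data matches the checksum enough servers agree on."""
    expected = agreed_digest(reported, quorum)
    return expected is not None and hashlib.sha256(data).hexdigest() == expected
```

Setting `quorum` to more than half the servers means a single compromised server cannot change the outcome, and there is no proof-of-work involved.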

Strive for K.I.S.S. whenever possible.

[–] [email protected] 18 points 1 month ago (1 children)

I don't know if that's a good idea.

How would you go about implementing the infrastructure for that?

[–] dohpaz42 2 points 1 month ago

That’s an excellent question. Unfortunately I do not have an answer. But I believe it’s worth discussing some means of redundancy for the IA; even if it’s as simple as rsync to other hosts.

[–] [email protected] 7 points 1 month ago

They’ve been using Filecoin

[–] felixwhynot 48 points 1 month ago (3 children)

A commenter on Ars suggested donating, so I did. You can too with this link! https://www.paypal.com/paypalme/internetarchive

[–] [email protected] 17 points 1 month ago (1 children)

I verified this is indeed the method listed on the Internet Archive website.

[–] [email protected] 3 points 1 month ago (1 children)

Nice. Wouldn’t want money going to |nternetArchive!

[–] [email protected] 3 points 1 month ago

Lol, it just seemed odd to me that it was a direct PayPal link, so I went to check.

[–] [email protected] 5 points 1 month ago (1 children)

I need to do this again. I donated last year, but it's one of my favorite sites, and a pretty important one.

[–] douglasg14b 2 points 1 month ago
[–] [email protected] 2 points 1 month ago
[–] [email protected] 45 points 1 month ago

It's worth noting that the saved pages are the only thing that is back for now. Their other services have not yet been brought back online.

[–] FlyingSquid 36 points 1 month ago* (last edited 1 month ago)

This absolutely made my morning.

Edit: Never mind, already knew about the Wayback Machine. I thought it was the rest of the archive.

Still good news.

[–] mesamunefire 31 points 1 month ago

Such good news!

[–] credo 18 points 1 month ago (1 children)

Okay, which one is missing?

[–] [email protected] 12 points 1 month ago (4 children)

I realize it's like the least important aspect of this, but yay! My podcast is back! I listen to Lawrence Manzo's Mahabharata podcast every night to go to sleep, and I haven't slept well since the attack

[–] [email protected] 32 points 1 month ago (1 children)

If you rely on it that much, maybe it's time to download it all and keep it.

[–] [email protected] 3 points 1 month ago (1 children)

I honestly don't know how I'd get it until it comes back. I can download through the podcast app, but until then, to my knowledge, it's completely lost anywhere other than archive.org. Even the original blog it was posted to back in 2010 doesn't have the audio files anymore, just links to archive.org

[–] vortexsurfer 11 points 1 month ago (1 children)

It should be possible to download the audio files directly from archive.org, using a browser.

[–] [email protected] 1 points 1 month ago

Once it's back, I'll definitely do that. It's still not available as of yet.
