this post was submitted on 29 Jun 2023
236 points (96.8% liked)

Technology


The lawsuit alleges OpenAI crawled the web to amass huge amounts of data without people's permission.

[–] SamB 30 points 1 year ago (2 children)

I doubt it’s only about some Reddit posts. The scraping was done on the whole web, capturing everything it could. So besides stealing data and presenting it as its own, it seems to have collected some even more problematic data which wasn’t properly protected.

[–] zekiz 23 points 1 year ago (4 children)

But that really isn't OpenAI's fault. Whoever was in charge of securing the patients' data really fucked up.

[–] [email protected] 24 points 1 year ago

Leaving your front door open isn't prudent but doesn't grant permission to others to enter and take/copy your belongings or data.

The security teams may have royally screwed up, but OpenAI has a legal obligation to respect copyright and laws regarding data ownership.

Likewise, they could have scraped pages that included terms of use, copyright, disclaimers, etc., and failed to honor them.

All parties can be in the wrong for different reasons.

[–] almar_quigley 14 points 1 year ago (4 children)

That’s like saying you didn’t lock your front door so whoever robs you is innocent.

[–] [email protected] 6 points 1 year ago (1 children)

I think it's a little closer to being mad that the Google Street View car drove by and snapped a picture of the front of your house, tbh.

[–] almar_quigley 1 points 1 year ago

Except PII and SPI are protected under law, just like your possessions.

[–] Dran_Arcana 6 points 1 year ago (1 children)

But does leaving your front door open allow one to legally take a picture of the inside from across the street? I'd say scraping is more akin to that than it is to theft. Nothing is removed in scraping, just copied.

[–] BradleyUffner 2 points 1 year ago

Bad analogy. This is like leaving your couch out on the sidewalk, then complaining when someone takes a picture of it.

[–] zekiz 5 points 1 year ago

It's more like leaving an important letter in the open for everyone to read. It's certainly your fault for leaving it that open.

[–] MercuryUprising 2 points 1 year ago

Yeah, but what were all these people whose data was scraped wearing?

[–] [email protected] 7 points 1 year ago (1 children)

It’s certainly their fault that they used it, though.

If they cared, they could have ensured they weren’t using sensitive or otherwise highly problematic information, but they chose not to. That’s on them.

[–] MercuryUprising -3 points 1 year ago (2 children)

It's called "disrupting" the established norms. You wouldn't get it because you're not on the bleeding edge of a revolutionary platform that's seeing scalable vertical growth due to its paradigm shift.

[–] [email protected] 4 points 1 year ago (1 children)

You forgot to mention something about blockchain

[–] assassin_aragorn 1 points 1 year ago

I can't see AI as anything but the next crypto. It seems incredibly overhyped to me

[–] [email protected] 3 points 1 year ago

My sarcasm detector is making strange noises. We may have a false positive here!

[–] [email protected] 1 points 1 year ago

They certainly fucked up, but it might well be OpenAI's fault too.

[–] tallwookie 8 points 1 year ago (1 children)

if it was unsecured it's basically public. whoever put that data on a publicly accessible server is at fault

[–] [email protected] 10 points 1 year ago* (last edited 1 year ago) (2 children)

That's not necessarily true. Even if a company makes the mistake of not securing data correctly, those that make use of this data can still be at fault.

If a company leaves a server wide open, you still can't legally steal information from it.

[–] tallwookie 1 points 1 year ago

that's kind of a grey area - digitally copying something that's public domain isn't stealing.

[–] [email protected] 0 points 1 year ago

> If a company leaves a server wide open, you still can’t legally steal information from it.

I don't see how this is any different than if Google search included text from a page that shouldn't be public.