this post was submitted on 08 Oct 2023
507 points (97.0% liked)

BBC will block ChatGPT AI from scraping its content

ChatGPT will be blocked by the BBC from scraping content in a move to protect copyrighted material.

Chreutz 10 points 1 year ago (1 children)

It's not one AI doing it in a big blob.

You ask ChatGPT something. It builds a web query. Another program runs that query and returns the search results. ChatGPT then parses the list of results and chooses one to visit. The same program fetches that page and returns its content, which ChatGPT parses in turn, and so on.

If the program (which is not an AI) that handles the queries and returns content is set to respect robots.txt, it will just not return the content to ChatGPT to be parsed.
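
To make that concrete, here is a rough sketch of such a fetching program in Python. It is only an illustration under my own assumptions, not how OpenAI's tooling is actually built: `make_fetcher`, `fake_download`, the `ExampleAIBot` user agent and the robots.txt rules are all made up for the example. The only real API used is the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def make_fetcher(robots_txt, user_agent, download):
    """Build a fetch function that checks robots.txt before downloading.

    robots_txt: the site's robots.txt as a string (hypothetical content here)
    user_agent: the name the fetching program identifies itself with
    download:   any callable that takes a URL and returns page content
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def fetch(url):
        # If the site disallows this user agent, return nothing at all,
        # so the language model never sees the page content.
        if not parser.can_fetch(user_agent, url):
            return None
        return download(url)

    return fetch


# Hypothetical rules: block one named bot, allow everyone else.
EXAMPLE_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def fake_download(url):
    # Stand-in for a real HTTP request, to keep the sketch offline.
    return f"<html>content of {url}</html>"


blocked = make_fetcher(EXAMPLE_ROBOTS, "ExampleAIBot", fake_download)
allowed = make_fetcher(EXAMPLE_ROBOTS, "SomeOtherBot", fake_download)

print(blocked("https://example.com/news/article"))
print(allowed("https://example.com/news/article"))
```

The check lives entirely in the plain fetching code. The model only ever receives whatever `fetch` hands back, so a disallowed page simply never reaches it.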

[email protected] 2 points 1 year ago

Yup, it's essentially running behind a firewall