this post was submitted on 25 Aug 2024
316 points (97.9% liked)

Fuck AI


Meta has quietly unleashed a new web crawler to scour the internet and collect data en masse to feed its AI model.

The crawler, named the Meta External Agent, was launched last month, according to three firms that track web scrapers and bots across the web. The automated bot essentially copies, or “scrapes,” all the data that is publicly displayed on websites, for example the text in news articles or the conversations in online discussion groups.

A representative of Dark Visitors, which offers a tool for website owners to automatically block all known scraper bots, said Meta External Agent is analogous to OpenAI’s GPTBot, which scrapes the web for AI training data. Two other entities involved in tracking web scrapers confirmed the bot’s existence and its use for gathering AI training data.

While close to 25% of the world’s most popular websites now block GPTBot, only 2% are blocking Meta’s new bot, data from Dark Visitors shows.
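For anyone wondering what "blocking" a crawler usually means in practice: it is typically a robots.txt rule naming the bot's user-agent token. Here is a minimal sketch, using Python's standard-library robots.txt parser, that checks which agents a given robots.txt disallows. The robots.txt contents, the `blocked_agents` helper, and the example URL are all made up for illustration, and the `meta-externalagent` token is an assumption about the name the bot identifies itself with, so check Meta's own crawler documentation before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. The "meta-externalagent"
# token is an assumption; confirm it against Meta's crawler documentation.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/"):
    """Return the agents from `agents` that this robots.txt disallows for `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print(blocked_agents(ROBOTS_TXT, ["GPTBot", "meta-externalagent", "SomeBrowser"]))
```

Keep in mind that robots.txt is only a request: it works if the crawler chooses to honor it, which is why some site owners also block by user agent or IP range at the server.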

Earlier this year, Mark Zuckerberg, Meta’s cofounder and longtime CEO, boasted on an earnings call that his company’s social platforms had amassed a data set for AI training that was even “greater than the Common Crawl,” an entity that has scraped roughly 3 billion web pages each month since 2011.

[–] riodoro1 77 points 3 months ago (3 children)

Fuck the planet, we need another one of those useless chatbots.

[–] [email protected] 42 points 3 months ago

Just another billion parameters bro! I swear if we add another billion it'll fix everything!

[–] [email protected] 15 points 3 months ago

the chatbots are there for them to pretend they're doing something useful for the end user, instead of just creating an ever-increasingly detailed unique digital profile of each individual with thousands of data points in order to separate you from your money

[–] [email protected] 4 points 3 months ago* (last edited 3 months ago)

Of course we do. A normal customer support agent of a random e-shop wouldn't write me a Python script to send an email alert if my Raspberry Pi overheats!