this post was submitted on 07 Feb 2025


FediDB has stopped crawling until they get robots.txt support (lemmy.world)

submitted 1 day ago* (last edited 11 hours ago) by mesamunefire to c/fediverse


We have paused all crawling as of Feb 6th, 2025 until we implement robots.txt support. Stats will not update during this period.


[–] [email protected] 6 points 1 day ago (1 children)

They do have a dedicated "Crawler" page.

And they do mention there that they use a website crawler for their Developer Tools and Network features.

[–] [email protected] 4 points 1 day ago

Maybe the definition of the term "crawler" has changed, but crawling used to mean downloading a web page, parsing its links, downloading every linked page, parsing those in turn, and so on until the whole site had been downloaded. If that corpus contained links to other sites, the same process repeated for those. Obviously this could cause heavy load, hence robots.txt.
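For anyone who hasn't seen it written down, here is a rough Python sketch of that classic crawl loop with a robots.txt check bolted on. It is only an illustration and has nothing to do with FediDB's actual code: the `crawl` function, the `fetch` callable, the `example-bot` user agent and the in-memory `example.test` pages are all made up for the example, and a real crawler would also need rate limiting and per-host robots.txt handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, robots_txt="", user_agent="example-bot", limit=100):
    """Breadth-first crawl from start_url, skipping URLs robots.txt disallows.

    `fetch` is a hypothetical callable mapping a URL to HTML text (or None),
    so the sketch can run without touching the network.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    seen = {start_url}
    queue = deque([start_url])
    visited = []

    while queue and len(visited) < limit:
        url = queue.popleft()
        if not rules.can_fetch(user_agent, url):
            continue  # robots.txt says keep out
        html = fetch(url)
        if html is None:
            continue  # page missing or failed to download
        visited.append(url)

        # Parse the page and queue every link not seen before.
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# A tiny in-memory "site" stands in for real HTTP requests.
PAGES = {
    "https://example.test/": '<a href="/a">A</a> <a href="/private/x">X</a>',
    "https://example.test/a": '<a href="/">home</a> <a href="/b">B</a>',
    "https://example.test/b": "no links here",
    "https://example.test/private/x": "secret",
}
ROBOTS = "User-agent: *\nDisallow: /private/"

result = crawl("https://example.test/", PAGES.get, ROBOTS)
print(result)
```

With the `Disallow: /private/` rule the crawl still follows every other link it finds, but never requests anything under `/private/`.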

FediDB isn't doing anything like that, so I'm a bit bemused by this whole thing.