I know a lot of sites now use browser fingerprinting and similar techniques to estimate how likely a user is to be a bot. The modern web tracks a lot of information about users, and all of it can be used to gauge how 'human' a user is, though that raises privacy concerns of its own. A sufficiently stalkerish site already knows whether you're human or not.
This CGP Grey video is great, and covers how captchas are often used to train the bots. https://www.youtube.com/watch?v=R9OHn5ZF4Uo
Speaking of which, how are regulators / governments going to deal with Lemmy? Virtually all existing legislation is intended to deal with centralized services run by companies, not federated ones. In some respects, there may be actual legal issues with the current setup.
Lemmy, by its nature, is unlikely to ever face the scrutiny that corporate-owned platforms do, but that doesn't mean we should be unprepared.
Edit: ...virtually all existing legislation...