this post was submitted on 05 Oct 2023
1186 points (98.2% liked)

Technology

[–] Smokeydope 228 points 11 months ago* (last edited 11 months ago) (40 children)

This is a copy/pasted message I wrote up on another thread. As long as there are people in the comments shilling Kagi, I will shill my preferred engines. At least my suggestions will bring awareness to free-as-in-freedom projects. I hope to god people paying $10/month just to not get datacucked by search engines will also learn something and save their money.

SearX/SearXNG is a free and open source, highly customizable, and self-hostable metasearch engine. SearX instances act as a middleman: they query other search engines for you and strip out all their spyware and ad crap, so your connection never touches those engines' servers. Of course you have to trust the SearX instance host with your query information, but again, if you are that paranoid, just self-host.
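To make the middleman idea concrete, here is a minimal Python sketch of the merging step. It is not SearXNG's actual code: the function names, the tracking-parameter list, and the reciprocal-rank scoring are all assumptions picked for illustration, and the per-engine result lists are passed in as plain data instead of being fetched over the network.

```python
from __future__ import annotations

from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of tracking parameters to drop; a real instance would use a longer one.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def strip_tracking(url: str) -> str:
    """Return the URL without known tracking parameters or a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES) and key not in TRACKING_KEYS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def merge_results(results_by_engine: dict[str, list[str]]) -> list[str]:
    """Merge ranked URL lists from several engines into one deduplicated ranking.

    Each URL scores 1/rank per engine that returned it (an assumed scoring rule),
    so URLs that several engines rank highly float to the top.
    """
    scores: dict[str, float] = {}
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            clean = strip_tracking(url)
            scores[clean] = scores.get(clean, 0.0) + 1.0 / rank
    # Highest score first; ties are broken alphabetically to keep the order stable.
    return sorted(scores, key=lambda u: (-scores[u], u))


if __name__ == "__main__":
    merged = merge_results({
        "engine_a": ["https://example.org/?utm_source=x", "https://example.com/page"],
        "engine_b": ["https://example.org/", "https://example.net/"],
    })
    print(merged)
```

The upstream engines only ever see the instance's request, and the user only ever sees the cleaned, merged list.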

I personally trust some FOSS-loving sysadmin who hosts social services for free out of altruism, who also accepts hosting donations, and whose server is located on the other side of the planet, with my query info over Google/Alphabet any day.

It's nice to be able to email and have a human conversation with your search engine provider, who's just a knowledgeable everyday joe who genuinely believes in the project and freely dedicates their resources to it. Consider sending some cash their way to help with upkeep if you like the services they provide; they will probably appreciate and make use of that $10 better than Kagi.

Here's a list of all public SearX instances. I personally prefer to use paulgo.io. All SearX instances are configured differently and query different engines. If one doesn't seem to give good results, try a few others.

Did I mention it has bangs like DuckDuckGo? If you really need Google, like for maps and business info, just use !!g in the query.
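For anyone curious how a bang like !!g can work, here is a rough Python sketch, not the real SearXNG implementation. The bang table and the function name are made up for illustration; the only idea taken from the text is that a !!-prefixed token sends the rest of the query to another site.

```python
from __future__ import annotations

from typing import Optional
from urllib.parse import quote_plus

# Hypothetical bang table: shortcut -> URL template with one placeholder for the query.
EXTERNAL_BANGS = {
    "g": "https://www.google.com/search?q={}",
    "w": "https://en.wikipedia.org/w/index.php?search={}",
}


def resolve_bang(query: str) -> Optional[str]:
    """Return a redirect URL if the query contains a known !!bang, else None."""
    tokens = query.split()
    for index, token in enumerate(tokens):
        shortcut = token[2:]
        if token.startswith("!!") and shortcut in EXTERNAL_BANGS:
            # Everything except the bang token is the actual search text.
            rest = " ".join(tokens[:index] + tokens[index + 1:])
            return EXTERNAL_BANGS[shortcut].format(quote_plus(rest))
    return None


if __name__ == "__main__":
    print(resolve_bang("!!g coffee near me"))
    print(resolve_bang("plain query"))
```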

search.marginalia.nu is a completely novel search engine written and hosted by one dude. It aims to prioritize indexing lighter websites with little to no JavaScript, as these tend to be personal websites and homepages that have poor SEO and that the big search engines won't index well. If you remember the internet of the early 2000s and want a nostalgia trip, this one's for you. It's also open source and self-hostable.

Finally, YaCy is another completely novel search engine that uses peer-to-peer technology to power a big web crawler, which prioritizes what it indexes based on user queries and feedback. Everyone can download YaCy and devote a bit of their computing power to both run their own local instance and help out a collective search engine. Companies can also download YaCy and use it to index their private intranets.

They have a public instance available through a web portal. To be upfront, YaCy is not a great search engine for what most people usually want, which is quick and relevant information within the first few clicks. But it is an interesting use of technology, and it shows what a true honest-to-god community-operated search engine looks like, untainted by SEO scores or corporate money-making shenanigans.

I hope this has been informative to those who believe there are only a few options to pick from. I know these options are unknown to most people.

[–] doktorseven 7 points 11 months ago* (last edited 11 months ago) (3 children)

When you need a scalable service for tons of users, federated isn't going to cut it. This is why Apple wants DDG. Point the bajillion crApple lusers at one of your public instances (or even all of them chosen at random each time) and watch it crash and burn overnight. DDG has tons of servers and the infrastructure to hold up while a ton of people search why their luxury device is slowing down every time Apple releases a new one.

[–] dm_me_your_feet 0 points 11 months ago* (last edited 11 months ago) (2 children)

Lol, federation is the definition of scalable. Everyone serves their local users -> a minuscule amount of global traffic; everything but auth always stays local.

Universities have been doing it since the beginning of the internet. Email is the biggest example but there are others: eduGAIN and eduroam are the most notable ones coming out of the academic community.

[–] doktorseven 2 points 11 months ago (1 children)

You are confusing a network of distinct servers with a single point of entry that a search engine would need to be. There is no fallback or distribution of search when everything is directed to a single search point, and pointing people to different search sites per search will remove any per-site preferences.

Do people think about what they say any more, or do they see someone who is trying to carefully explain their problem and just go into pure rage and try to disprove them by spewing things that do not make any sense?

[–] dm_me_your_feet 1 points 11 months ago* (last edited 11 months ago)

No search engine has a "single point of entry". Every search engine has cache servers all over the world at almost every major IXP. Nothing would prevent a federated service from operating the same way. Cloudflare or literally any form of load balancer or load balancing service could be used to redirect queries for fedisearch (or whatever the service name would be) to the local instance by IP geolocation. Authentication can just be forwarded to the home server via SAML; that's also where the settings can be stored and queried at login time by the local instance. SAML assertions are very scalable, and there needs to be no global login server, since every user's login request can be forwarded to their home instance, where their profile is loaded. The full search index could be put into a blockchain that every local instance joins: every instance crawls its area and publishes new results to the chain. You seem to know very little about how the internet works, yet you accuse me of raging.

That the FOSS community can manage things like that has been proven for years. The Debian mirror network works in a similar way (they run their own load balancer ofc), while being cryptographically secure. And if you wanna see a federated login network like I described in action, just go to https://pubs.acs.org/action/ssostart

All these parts I described are existing technology and in global use. The combination is not, but there is nothing that would prevent a foundation from implementing search like this.
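As a rough illustration of the routing part only, here is a small Python sketch. Everything in it is hypothetical: the instance hostnames, their coordinates, and the user@instance handle format are assumptions, and a real deployment would do this in DNS or at a load balancer and use actual SAML for login, not a helper function.

```python
from __future__ import annotations

import math

# Hypothetical federated instances with approximate (latitude, longitude).
INSTANCES = {
    "eu.search.example": (50.11, 8.68),
    "us.search.example": (39.04, -77.49),
    "ap.search.example": (1.35, 103.82),
}


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_instance(client_coords: tuple[float, float]) -> str:
    """Pick the instance closest to the client's geolocated position."""
    return min(INSTANCES, key=lambda name: haversine_km(client_coords, INSTANCES[name]))


def home_instance(handle: str) -> str:
    """Extract the home instance from a 'user@instance' handle (assumed format).

    Login requests would be forwarded here; searches go to the nearest instance.
    """
    user, _, host = handle.rpartition("@")
    if not user or not host:
        raise ValueError(f"expected 'user@instance', got {handle!r}")
    return host


if __name__ == "__main__":
    print(nearest_instance((48.85, 2.35)))            # a client geolocated near Paris
    print(home_instance("alice@ap.search.example"))  # where this user's login would go
```

The point of splitting the two lookups is that search traffic stays local while only the login hop goes to the user's home instance.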
