this post was submitted on 25 Oct 2024
353 points (97.1% liked)

Curated Tumblr

For preserving the least toxic and most culturally relevant Tumblr heritage posts.

The best transcribed post each week will be pinned and receive a random bitmap of a trophy superimposed with the author's username and a personalized message. Here are some OCR tools to assist you in your endeavors.

- web
- iOS
- Android

Don't be mean. I promise to do my best to judge that fairly.

founded 2 years ago
[–] [email protected] 5 points 3 months ago* (last edited 3 months ago)

As per my other post, this person isn't doing any of that.

But, since you asked for papers on generic matching algorithms, I found this during the silent conniption fit you sent me into after suggesting that some random tumblr user plugged a tumblr bot directly into a state of the art genomics db.

https://link.springer.com/article/10.1007/s11227-022-04673-3

Please note that while, yes, they ran this test on a standard office computer, they were only searching against 12 million characters.

A single tebibyte of characters, at one byte per character, would be more like 1.1 trillion characters. A pebibyte would be more like 1.1 ~~quintillion~~ quadrillion.

... much, much, much longer processing times.
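For a rough sense of the gap, here is a back-of-the-envelope sketch. It assumes one byte per character, and the `scale_factor` helper is just a name made up for illustration. It only compares text sizes; it says nothing about how any particular algorithm's run time actually grows.

```python
# Assumption: one character occupies one byte (e.g. plain ASCII letters).
TESTED_CHARS = 12_000_000  # text size from the test mentioned above
TEBIBYTE = 2**40           # bytes in one tebibyte
PEBIBYTE = 2**50           # bytes in one pebibyte


def scale_factor(target_chars: int, baseline_chars: int = TESTED_CHARS) -> float:
    """Return how many times larger the target text is than the baseline."""
    return target_chars / baseline_chars


for name, size in (("tebibyte", TEBIBYTE), ("pebibyte", PEBIBYTE)):
    print(f"1 {name}: {size:,} chars, about {scale_factor(size):,.0f}x the tested text")
```

Even if search time grew only linearly with the size of the text (a generous simplification), a tebibyte would be tens of thousands of times more work than the 12 million character test, and a pebibyte tens of millions of times more.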

Edit: Used the wrong word for stupendously large numbers that start with q.