this post was submitted on 28 Dec 2024
Technology
I think the fundamental issue is that you're assuming that information theory refers to entropy as uncompressed data but it's actually referring to the amount of data assuming ideal/perfect compression.
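To make the compression point concrete, here is a rough sketch using Python's standard `zlib` module. zlib is nowhere near an ideal compressor, so treat its output size as a loose upper bound on the information content, not a measurement of entropy. The sample text is stitched together from sentences in this thread, and the seeded noise is only there as a comparison.

```python
import random
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more redundancy)."""
    return len(zlib.compress(data, 9)) / len(data)


# Sample English text (sentences taken from this thread).
english = (
    "There are only 26 letters in the English alphabet, so fitting in a "
    "meaningful character space can be done in less than 5 bits. "
    "The fact is, most of what we read is already familiar in some way. "
    "Information flows into your brain through your senses and then out "
    "again in the form of behavior."
).encode("ascii")

# Pseudo-random bytes of the same length, seeded so the run is repeatable.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(len(english)))

print(f"english: {compressed_ratio(english):.2f}")
print(f"noise:   {compressed_ratio(noise):.2f}")
```

The English sample should shrink while the noise should not shrink at all, which is the sense in which the same number of characters can carry very different amounts of information.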
There are only 26 letters in the English alphabet, so a fixed-width code needs at most 5 bits per letter (2^5 = 32). Morse code, for example, encodes each letter in at most 4 binary symbols, dots or dashes (the most common letters use fewer, and the longest use 4). A typical sentence will reduce down to an average of 2-3 symbols per letter, plus the pause between letters.
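A quick sketch to check those numbers. The table is the standard International Morse code for A-Z; the function names are just mine for illustration. One caveat: dots and dashes are not strictly bits, because Morse also relies on the gaps between letters, so this only counts symbols.

```python
import math

# International Morse code for the 26 letters.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
    "F": "..-.", "G": "--.", "H": "....", "I": "..", "J": ".---",
    "K": "-.-", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-",
    "U": "..-", "V": "...-", "W": ".--", "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def fixed_width_bits(alphabet_size: int) -> int:
    """Bits needed to give every symbol its own fixed-width code."""
    return math.ceil(math.log2(alphabet_size))


def mean_symbols_per_letter(text: str) -> float:
    """Average number of dots/dashes per letter, ignoring non-letters."""
    letters = [c for c in text.upper() if c in MORSE]
    if not letters:
        return 0.0
    return sum(len(MORSE[c]) for c in letters) / len(letters)


print(fixed_width_bits(26))                       # 5
print(max(len(code) for code in MORSE.values()))  # 4
print(round(mean_symbols_per_letter("the rain in spain"), 2))
```

Because the shortest codes go to the most common letters, ordinary text averages noticeably fewer symbols per letter than the 4-symbol maximum.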
And because the distribution of letters in any given English text is nonuniform, there's less meaning per letter than it takes to strictly encode things by individual letter. You can assign values to whole words and get really efficient that way, especially using variable encoding for the more common ideas or combinations.
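Here is a minimal sketch of what "nonuniform means fewer bits" looks like, using the Shannon entropy formula on single-character frequencies. This is only an order-0 estimate: it ignores all context between letters and words, so it overstates the real entropy of English. The function name is my own.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(text: str) -> float:
    """Shannon entropy of the single-character distribution, in bits."""
    counts = Counter(text)
    total = len(text)
    # H = sum over symbols of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(entropy_bits_per_symbol("aaaa"))  # 0.0  (one symbol, no surprise)
print(entropy_bits_per_symbol("abcd"))  # 2.0  (four equally likely symbols)

sample = "the fact is, most of what we read is already familiar in some way"
print(round(entropy_bits_per_symbol(sample), 2))
print(round(math.log2(len(set(sample))), 2))  # uniform-distribution ceiling
```

The sample sentence comes in below the ceiling you would get if every character it uses were equally likely, and that gap is the redundancy a variable-length code can exploit.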
If you scour the world of English text, the 15-character string "Abraham Lincoln" will be far more common than even the 3-letter string "xqj," so many multi-character expressions convey far fewer bits of entropy than their length suggests. So it might take someone longer to memorize a truly random 10-character string, including case sensitivity, symbols, and numbers, than it would to memorize a 100-character sentence that actually carries meaning.
Finally, once you actually get to reading and understanding, you're not meticulously remembering literally every character. Your brain is preprocessing some stuff and discarding details without actually consciously incorporating them into the reading. Sometimes we glide past typos. Or we make assumptions (whether correct or not). Sometimes when tasked with counting basketball passes we totally miss that there was a gorilla in the video. The actual conscious thinking discards quite a bit of the information as it is received.
You can tell how much faster it is to read something that is within your own existing knowledge than something that is entirely new, on a totally novel subject that you have no background in. Your sense of recall is going to be less accurate with that stuff, or you're going to significantly slow down how you read it.
If you're preparing to be tested on the recall of each and every one of those things, you're going to find yourself reading a lot slower. You can read the entire reading passage but be totally unprepared for questions like "how many times did the word 'the' appear in the passage?" And that's because the way you actually read and understand is going to involve discarding many, many bits of information that don't make it past the filter your brain puts up for that task.
For some people, memorizing the sentence "Linus Torvalds wrote the first version of the Linux kernel in 1991 while he was a student at the University of Helsinki" is trivial and can be done in a second or two. Many others, who might not have the background to know what the sentence means, might struggle to parrot back that idea without studying it for at least 10-15 seconds. And the results might be flipped for different people on another sentence, like "Brooks Nader repurposes engagement ring from ex, buys 9-carat 'divorce ring' amid Gleb Savchenko romance."
The fact is, most of what we read is already familiar in some way. That means we're processing less information than we're taking in, and discarding a huge chunk of what we perceive before it reaches what we actually think. And when we encounter things that we didn't necessarily expect, we slow down or we misremember things.
So I can see how the 10-bit number comes into play. It cited various studies showing that image/object recognition tends to operate in the high 30s of bits per second, and that many memorization or video game playing tasks involve processing in the range of 5-10 bits per second. Our brains are just highly optimized for image processing and language processing, so I'd expect those tasks to be higher performance than other domains.
So in other words - if we highly restrict the parameters of what information we’re looking at, we then get a possible 10 bits per second.
Not exactly. More the other way around: that human behaviors in response to inputs are only observed to process about 10 bits per second, so it is fair to conclude that brains are highly restricting the parameters of the information that actually gets used and processed.
When you require the brain to process more information and discard less, it forces the brain to slow down, and the observed rate of speed is on the scale of 5-40 bits per second, depending on the task.
Not quite. Information always depends on context. It is not a fundamental physical quantity like energy. When you have a piece of paper with English writing on it, then you can read and understand it. If you don't know the script or language, you won't even be able to tell if it's a script or language at all. Some information needs to be in your head already. That's simply how information works.
You take in information through the senses and do something based on that information. Information flows into your brain through your senses and then out again in the form of behavior. The throughput is throttled to something on the order of 10 bits/s. When you think about it for a bit, you realize that a lot of things are predicated on that. Think of a video game controller. There's only a few buttons. The interface between you and the game has a bandwidth of only a few bits.
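As a back-of-the-envelope sketch of the controller example: if each action picks one of N equally likely options, it carries log2(N) bits, and the rate is that times actions per second. The 8 buttons and 3 presses per second below are made-up numbers for illustration, not figures from the paper, and the equal-likelihood assumption makes this an upper bound.

```python
import math


def input_rate_bits_per_second(choices: int, actions_per_second: float) -> float:
    """Upper bound on information rate, assuming equally likely, independent choices."""
    return math.log2(choices) * actions_per_second


# Hypothetical controller: 8 buttons, 3 presses per second.
print(input_rate_bits_per_second(8, 3))  # 9.0
```

Real play is more predictable than uniformly random button presses, so the actual rate would be lower than this bound.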