this post was submitted on 15 Aug 2023
Technology
It would have to be always active, checking for radiation-induced bit flips, not just powered off.
It should be fine for normal use cases when used with error correcting codes without any active scrubbing.
According to error rates for ECC RAM (which should be comparable to within an order of magnitude) of 1 bit error per gigabyte of RAM per 1.8 hours^1^, we would assume ~5000 errors in a year. The average likelihood of hitting an already affected byte is approx. (5000/2)/1e9 = 2.5e-6. So that probability * 5000 errors gives about a 1.2 percent chance that two errors occur in one byte after a year. It grows roughly quadratically once you start going past a year. But in total, I would say that standard error correcting codes should be sufficient to catch all errors, even if in hibernation for a whole year.
[1] https://en.wikipedia.org/wiki/ECC_memory
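For anyone who wants to redo the back-of-the-envelope estimate above, here is a minimal sketch. It assumes 1 GB of storage, the quoted rate of one bit error per gigabyte per 1.8 hours, and errors landing uniformly at random; the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24   # 8760
HOURS_PER_ERROR = 1.8       # quoted rate: one bit error per gigabyte per 1.8 hours
BYTES = 1e9                 # assumption: 1 GB of storage


def expected_errors(years: float) -> float:
    """Expected number of bit errors accumulated in 1 GB after `years`."""
    return years * HOURS_PER_YEAR / HOURS_PER_ERROR


def double_hit_estimate(years: float) -> float:
    """Rough chance that some byte has been hit twice.

    On average, half of the errors are already present when a new one
    arrives, so each new error lands on an affected byte with probability
    about (n / 2) / BYTES. Multiplying by n errors gives the estimate.
    """
    n = expected_errors(years)
    return n * (n / 2) / BYTES


if __name__ == "__main__":
    for years in (1, 2, 5):
        print(years, round(expected_errors(years)), f"{double_hit_estimate(years):.2%}")
```

Without rounding to 5000, one year comes out at roughly 4870 errors and a bit under 1.2 percent. Under this approximation, doubling the storage time quadruples the estimate, and the approximation stops being meaningful once the result is no longer small.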
My initial thought was that everything would be stored in triplicate, then read in triplicate and 'voted' to the correct value, but I guess even that only extends the time before random bit-flips make the data unreadable. You're probably right on the need for active error checking if there is an intention to store anything long-term in this manner.
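To make the triplicate idea concrete, here is a minimal sketch of a bitwise majority vote over three stored copies. The function name `vote` and the example data are hypothetical, not anything from the article.

```python
def vote(a: bytes, b: bytes, c: bytes) -> bytes:
    """Bitwise majority vote: each output bit is the value held by at least two copies."""
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


if __name__ == "__main__":
    original = b"hello"

    # Two copies get one flipped bit each, in different places.
    copy_a = bytearray(original)
    copy_a[0] ^= 0x01
    copy_b = bytearray(original)
    copy_b[3] ^= 0x80

    print(vote(bytes(copy_a), bytes(copy_b), original) == original)
```

The vote recovers the data as long as no bit position is flipped in two of the three copies. Once the same bit is flipped in two copies, the vote returns the wrong value, which is the "only extends the time" limitation.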
TMR (the triplicate method) wouldn't be super suitable for this kind of application since it is a bit overkill in terms of redundancy. Just from an information theory perspective, you should only have enough parity for the amount of corruption you are expecting (in this case, not a lot, maybe a handful of bits after a year or two). TMR is optimal when you are expecting the whole result to be either wrong or right, not just corrupted. ECC and periodic scrubbing should be suitable for this. That is what is done by space-grade processors and RAM.
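As a toy illustration of ECC plus scrubbing, here is a sketch using the textbook Hamming(7,4) code, which spends 3 parity bits on 4 data bits and corrects any single flipped bit per word. This is an assumption chosen for brevity, not the code any particular memory or space-grade part uses, and the function names are made up.

```python
def encode(nibble: int) -> list[int]:
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    code = [0] * 8  # index 0 unused so indices match bit positions
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4, 5, 6, 7
    return code[1:]


def syndrome(word: list[int]) -> int:
    """XOR of the positions of all set bits: 0 if clean, else the flipped position."""
    s = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            s ^= pos
    return s


def scrub(word: list[int]) -> list[int]:
    """One scrubbing pass: repair a single flipped bit, if there is one."""
    fixed = list(word)
    s = syndrome(fixed)
    if s:
        fixed[s - 1] ^= 1
    return fixed


def decode(word: list[int]) -> int:
    """Correct up to one bit error and return the 4 data bits."""
    w = scrub(word)
    return w[2] | (w[4] << 1) | (w[5] << 2) | (w[6] << 3)


if __name__ == "__main__":
    stored = encode(0b1011)
    stored[4] ^= 1  # simulate one radiation-induced flip
    print(decode(stored) == 0b1011)
```

Running `scrub` periodically and writing the result back resets the error count of each word to zero, so a second flip only causes trouble if it hits the same word before the next pass. Two flips in one word between passes would be miscorrected by this toy code.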
wrapped in gold like a satellite?
The gold around satellites is actually very thin layers of Mylar, aluminum foil and Kapton (a type of golden, transparent plastic), which are used to keep the heat inside the satellite inside, and the heat outside, outside (see Multi-Layer Insulation). Radiation shielding usually comes from the aluminum structural elements of the spacecraft, or is placed close to the electronics so you do not waste too much mass on shielding material. Basically, shielding efficacy is mostly determined by its thickness, so it quickly becomes quite heavy.
That's not gold, it's just a heat sheet.
I took this article to be referring specifically to a new form of non-volatile solid-state storage. Active memory is, by definition, volatile. This article seems to be talking about non-volatile RAM that is fast enough to function as active RAM. That alone would redefine what a reboot is.