I am mainly hosting Jellyfin, Nextcloud, and Audiobookshelf. The files for these services are currently stored on a 2TB HDD and I don't want to lose them in case of a drive failure. I bought two 12TB HDDs because 2TB got tight and I thought I could add redundancy to my system, to prevent data loss due to a drive failure. I thought I would go with a RAID 2 (or another form of RAID?), but everyone on the internet says that RAID is not a backup. I am not sure if I need a backup. I just want to avoid losing my files when the disk fails.
How should I proceed? Should I use RAID2, or rsync the files every, let's say, week? I don't want to have another machine, so I would hook up the rsync target drive to the same machine as the rsync host drive! Rsyncing the files seems to be very cumbersome (also when using a cron job).
Two disks in the same machine are not a backup, whether the data is copied between them using RAID or rsync or anything else.
Sounds like for this machine, just use the two disks in RAID1, or a ZFS mirror, or something. And figure out something else for backups. Probably a cloud solution.
Also, RAID2 requires a minimum of 3 disks, and is rarely used.
I'd argue it is a backup as long as something is doing snapshots of some kind to the other disk, and not realtime sync like raid. Obviously that should not be your only backup though.
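To make the "snapshots to the other disk" idea concrete, here is a minimal sketch, assuming rsync is installed and both disks are mounted on the same machine. The function names, the dated-folder layout, and the "latest" symlink are my own illustration, not anything from the posts above. It relies on rsync's --link-dest option, which hard-links unchanged files against the previous snapshot so each dated folder looks like a full copy while only changed files take up new space.

```python
from datetime import datetime
from pathlib import Path
import subprocess


def build_rsync_command(source: Path, snapshot_root: Path, now: datetime) -> list[str]:
    """Build an rsync command that writes a dated snapshot under snapshot_root.

    Use absolute paths: rsync resolves a relative --link-dest against the
    destination directory, which is easy to get wrong.
    """
    target = snapshot_root / now.strftime("%Y-%m-%d_%H%M%S")
    latest = snapshot_root / "latest"  # hypothetical symlink to the newest snapshot
    return [
        "rsync",
        "-a",                       # archive mode: recurse, keep permissions and times
        f"--link-dest={latest}",    # hard-link files unchanged since the last snapshot
        f"{source}/",               # trailing slash: copy the contents of source
        str(target),
    ]


def run_snapshot(source: Path, snapshot_root: Path) -> None:
    """Run one snapshot, then point the 'latest' symlink at it."""
    cmd = build_rsync_command(source, snapshot_root, datetime.now())
    subprocess.run(cmd, check=True)  # raises CalledProcessError if rsync fails
    latest = snapshot_root / "latest"
    if latest.is_symlink():
        latest.unlink()
    latest.symlink_to(cmd[-1])
```

Calling run_snapshot(Path("/mnt/data"), Path("/mnt/backup/snapshots")) from a weekly cron job would give one folder per run (both paths are placeholders). On the very first run the "latest" link does not exist yet, so rsync just does a full copy. This only protects against accidental deletion or a single dead disk, not against anything that takes out the whole machine.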
3-2-1
https://www.backblaze.com/blog/the-3-2-1-backup-strategy/
It’s been literally a couple of decades now, but I once had to troubleshoot multiple RAID failures in a number of identical servers that were all running 6-disk RAID-5. Long story short, the power supply in each server was slowly losing its ability to power all the drives at the same time, so random drives started throwing errors. By the time we figured out the root cause, most of the drives had generated enough errors that the RAID controller couldn’t rebuild the volumes.
So, no, as others have said RAID is not a backup and should never be treated as such. A single point of failure like the power supply can easily cause the loss of the entire volume without warning.
It's a 'hot copy' or just 'copy' if you rsync/whatever the files. And they'll be gone too if the whole system fails due to power supply faulting, thunderstorm hitting the lines, misplaced coffee cup falling over, dropping the whole machine and so on...
If you make backups/snapshots they're not the same as just a copy, still useful for recovering from accidental deletion of files or something like that. Obviously should not be your only backup though.
My most common use of the local backups for my house is someone needs a file they deleted by accident or an older version of a file.
Yes, I get that. They can be very useful, especially if you share a NAS with family or something similar. At work the most common request for backup recovery is a user error, by a huge margin, so I guess you could call a separate copy a backup too, but like you said, it should not be the only copy. I'm personally a bit hesitant to call that a backup at all, but you do you, I'm not going to debate what qualifies.
3-2-1 is obviously the best approach, but (in my opinion) for the majority 2-1-1 (two copies, both on hard drives, with one copy offsite) is enough, even if you run a small business, as long as the offsite copy is incremental, so that you can revert to an earlier date and mitigate ransomware as well as a user error which isn't immediately noticed.
In any case, the only fact I can rely on after ~20 years of experience in the business is that hardware breaks. The only question is 'when', not 'if'. And no matter if you're a home gamer or a system architect for Meta, you need to plan how to mitigate that risk. Running everything in a single location on two separate pieces of hardware is better than having only one mainboard, and from there you can mix and match whatever you want, the limiting factors (mostly) being your time and wallet.
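If it helps to see the two rules side by side, here is a small sketch that checks a list of copies against an "N copies, M media types, K offsite" rule. The Copy class, the meets_rule function, and the example setups are all made up for illustration. I'm assuming the live data counts as one of the copies, which is how 3-2-1 is usually counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "hdd", "cloud", "tape"
    offsite: bool  # True if stored away from the main machine's location


def meets_rule(copies: list[Copy], min_copies: int, min_media: int, min_offsite: int) -> bool:
    """Check copies against a 'copies - media types - offsite' rule such as 3-2-1."""
    media = {c.medium for c in copies}
    offsite = sum(1 for c in copies if c.offsite)
    return (
        len(copies) >= min_copies
        and len(media) >= min_media
        and offsite >= min_offsite
    )


# Two disks in the same machine: no offsite copy at all.
same_box = [Copy("live", "hdd", False), Copy("rsync target", "hdd", False)]

# Live disk plus one HDD kept somewhere else.
with_offsite_hdd = [Copy("live", "hdd", False), Copy("offsite disk", "hdd", True)]

# Both local disks plus a cloud copy.
with_cloud = same_box + [Copy("cloud", "cloud", True)]

print(meets_rule(same_box, 2, 1, 1))          # False
print(meets_rule(with_offsite_hdd, 2, 1, 1))  # True
print(meets_rule(with_offsite_hdd, 3, 2, 1))  # False
print(meets_rule(with_cloud, 3, 2, 1))        # True
```

The check says nothing about whether the offsite copy is incremental, which is the part that lets you roll back after ransomware or a late-noticed mistake.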