"3! 2! 1!" is just what I say when doing some potentially deleterious action after rsyncing a few key directories to a separate volume.
3 sticky notes telling me to "go get that incremental backup working",
2 separate external hard drives,
1 month out of date
Same lol. Can’t be that catastrophic. Right? …. Right?
borgmatic is way too easy and hetzner storage box is way too cheap to have any excuses
A usb stick and an old hard drive from 2009. The crackhead way of dealing with backups.
I dump my encrypted data to someone who probably practices 3-2-1 rule (which is Backblaze for me). I mean, these guys back up data for a living.
- Primary ZFS pool with automatic snapshots
  - Provides 3+ copies of the files via snapshots (3)
- Secondary ZFS pool at a different location replicates the primary
  - Provides more copies of the files (3)
  - Provides second media (2)
  - Is off-site (1)
Does this make sense?
I don't think this meets the definition of 3-2-1. Which isn't a problem if it meets your requirements. Hell, I do something similar for my stuff. I have my primary NAS backed up to a secondary NAS. Both have BTRFS snapshots enabled, but the secondary has a longer retention period for snapshots (one month vs. one week). Then I have my secondary NAS mirrored to a NAS at my friend's house for an offsite backup.
This is more of a 4-1-1 format.
But 3-2-1 is supposed to be:
- Three total copies of the data. Snapshots don't count here, but the live data does.
- On two different types of media, i.e. one backup on HDD and another on optical media or tape.
- With at least one backup stored off site.
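To make the three conditions above concrete, here is a minimal Python sketch that checks a list of copies against them. The `Copy` class, its fields, and `check_321` are made-up names for illustration, and it assumes copies that live on the same physical device only count once, which is one reading of the rule rather than an official definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data (hypothetical model, not from any tool)."""
    name: str
    medium: str    # e.g. "hdd", "tape", "optical", "cloud"
    device: str    # physical device or system holding the copy
    offsite: bool  # True if stored at another location


def check_321(copies):
    """Return which of the three 3-2-1 conditions a setup satisfies.

    Assumption: copies on the same device are counted once, so
    snapshots or extra folders on one disk do not add to the total.
    """
    devices = {c.device for c in copies}
    media = {c.medium for c in copies}
    return {
        "three_copies": len(devices) >= 3,
        "two_media": len(media) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


# The NAS -> NAS -> friend's NAS chain described above: all HDD.
nas_chain = [
    Copy("primary NAS", "hdd", "nas-1", False),
    Copy("secondary NAS", "hdd", "nas-2", False),
    Copy("friend's NAS", "hdd", "nas-3", True),
]
print(check_321(nas_chain))
```

Under these assumptions the NAS chain passes the copies and offsite checks but fails the two-media check, which matches the "not quite 3-2-1" verdict above.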
I've always understood 2 as 2 physically different media - i.e., copies in different folders or partitions of the same disk are not enough to protect against failure of that disk, but a copy on a different disk is. Ideally 2 physically different systems, so failure/fire in the primary system won't corrupt/damage the backup.
Used to be that HDDs were expensive and using them as backup media would have been economically crazy, so most systems evolved backup media to be slower and cheaper. The main thing is that having /home/user/critical, /home/user/critical-backup, and /home/user/critical-backup2 satisfies 3 copies, but not 2 media.
Hm I wonder why snapshots wouldn't satisfy 3. Copies on the same disk like /file, /backup1/file, /backup2/file should satisfy 3. Why wouldn't snapshots be equivalent if 3 doesn't guard against filesystem or hardware failure? Just thinking and curious to see opinion.
If I'm reading your example right, I don't think that would satisfy three either. Three copies of the data on the same filesystem or even the same system don't satisfy the "three backups" rule, because the only thing you're really protecting against is maybe user error, i.e. accidental deletion or modification. You're not protecting against filesystem corruption or system failure.
For a (little bit hyperbolic) example, if you put the system that has your live data on it through a wood chipper, could you use one of the other copies to recover your critical data? If yes, it counts. If no, it doesn't.
Snapshots have the same issue, because at the root a snapshot is just an additional copy of the data. There's additional automation, deduplication, and other features baked into the snapshot process but it's basically just a fancy copy function.
Edit: all of the above is also why the saying "RAID is not a backup" holds true.
Right so I guess the question of 3 is whether it means 3 backups or 3 copies. If we take it literally - 3 copies, then it does protect from user error only. If 3 backups, it protects against hardware failure too.
E: Seagate calls them copies and explicitly says the implementer can choose how the copies are distributed across the 2 media. The woodchipper scenario would be handled by the 2 media requirement.
My current plan once new migration is completed:
Primary pool - 1x ZFS (couldn't afford redundancy but no different to my RPI server). My goal is to get a few more drives and set up a RAIDZ1/2.
Weekly backup of critical data (eg. nextcloud) from primary pool to a secondary pool. Goal here is to get a mirror but will only be one drive for now.
Weekly upload of secondary pool to hetzner storage box via rsync.
Current server
1x backup to secondary drive (rpi)
1x backup to hetzner storage box via rsync
my backup is staring longingly at LTO drives and wishing they would magically be affordable.
4-2-1-1 for me I guess 🫣 or 4-2-2?
Two copies at home, synced daily, one of them on an external drive that I like to refer to as the emergency grab and run copy lol
One at a family member's, synced weekly and manually every time I visit.
All of those three copies are always within a 10 kilometer radius in a valley overseen by a volcano so..
One partial copy of the so-critical-would-cry-if-lost data is synced every few days to a backblaze bucket.
DO NOT follow my lead, my backup solution is scuffed at best.
3:
I have:
- RAID1 array w/ 2 drives
- Photos on the device that took them
- Photos on a random old hard drive pulled from an ancient apple mac.
2:
I've got a hard drive and flash memory?
1:
Don't have this at all, the closest is that my phone is off-site half of the day.
Real selfhosters know
I use Proxmox Backup Server for my backups. Everything backs up to 1 system at home. I then sync the data store to a little NAS I have at a family member's house across town and also to a cheap storage VPS on the other side of the country. I also do a manual sync of the data store to a single external drive that I manually connect and disconnect.
None of my data hoarding files are backed up as that would cost way too much. That could change if I ever find a killer deal on an LTO8 or better drive and tapes.
I know that Hetzner has some decently priced Storage Boxes that you can mount using rclone and then backup to. Keep in mind that latency will be a factor so it could be slow.
I rawdog storage. I RAID0 and forget. huehue.
Toss in another drive for RAID5. That way you can at least have some redundancy...
It's not important data. Why would I spend another $200+ for another 20TB drive to have redundancy for 1 and 0 I don't care about...
Fair point.
2x 2TB with rsync
2x 500GB with rsync
1x 1TB cloud drive via rsync
The 2TB has a 500GB dir that gets cloned to the other 2 500GB drives and the cloud.
4 drives, 2 locations (1 offsite)
I could spare 500gb portion somewhere I guess but it's just easy atm that the important 500GB gets copied around 1x a week.
1 backup on a local, independent disk. 1 backup on a HDD connected to an OpenWRT router at the other end of the house. 1 backup on my remote VPS.
Restic+backrest
Sftp for remote endpoint
All persistent storage from my dockers is in a folder. All I have to do to back up everything is back up this one folder along with my docker compose files (in git).
Locally there are zfs snapshots (autosnapshot) and for remote I use borgmatic.
Borg to :
- Local server
- Friends server
- Borgbase
Atm main sys is a ZFS RAIDZ1 on 3 SSDs
Weekly-ish backup onto 1TB external HDD.
Sync encrypted important stuff to Cloud.
Syncthing some stuff to smartphone.
All storage is on a Ceph cluster with 2 or 3 disk/node replication. Files and databases are backed up using Velero and Barman to S3-compatible storage on the same cluster for versioning. Every night, those S3 buckets are synced and encrypted using rclone to a 10tb Hetzner Storage Box that keeps weekly snapshots.
Config files in my git repo:
https://codeberg.org/jlh/h5b/src/branch/main/argo/external_applications/velero-helm.yaml
https://codeberg.org/jlh/h5b/src/branch/main/argo/custom_applications/bitwarden/database.yaml
https://codeberg.org/jlh/h5b/src/branch/main/argo/custom_applications/backups
https://codeberg.org/jlh/h5b/src/branch/main/argo/custom_applications/rook-ceph
Bit more than 3 copies, but hdd storage is cheap. Majority of my storage is Jellyfin anyways, which doesn't get backed up.
I'm working on setting up some small nvme nodes for the ceph cluster, which will allow me to move my nextcloud from hdd storage into its own S3 bucket with 4+2 erasure coding (aka raid 6). That will make it much faster and also cut raw storage usage from 4x to 1.5x of usable capacity.
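The 1.5x figure for 4+2 erasure coding follows from simple arithmetic: every 4 data chunks are stored alongside 2 parity chunks, so raw usage is (4 + 2) / 4 of usable capacity. A small Python sketch of that calculation, with function names that are just illustrative (the 4x starting figure is taken from the post as-is and not derived here):

```python
def erasure_overhead(data_chunks: int, parity_chunks: int) -> float:
    """Raw storage used per unit of usable capacity with k+m erasure coding."""
    return (data_chunks + parity_chunks) / data_chunks


def replication_overhead(replicas: int) -> float:
    """Raw storage used per unit of usable capacity with plain N-way replication."""
    return float(replicas)


print(erasure_overhead(4, 2))   # 4+2 erasure coding
print(replication_overhead(3))  # 3-way replication, for comparison
```

Like RAID 6, a 4+2 layout survives the loss of any two chunks while costing far less raw space than keeping several full replicas.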
I've a nightly cronjob that runs a local backup using rsync, and an external HDD that I stash in my work locker that I bring home once a week or so to connect to the server, run a backup script (more rsync), then take it back to work. It's not super sophisticated, but it works, and I have tested and restored from both the local and offsite backups.
Currently only have pictures and documents stored, so everything easily fits on 1tb. One copy on my homeserver (unencrypted), one copy on my laptop (LUKS encrypted), and one copy with rsync and a raspi at my parents (unencrypted). Might change encryption strategies to all LUKS.
All my video media that's easier to replace than preserve is on my NAS running openmediavault with mergerfs. If I lose a drive I can always just, you know, torrent the tv show again.
My main PC (everything except the Steam game install directory) is backed up through KopiaUI to a folder on that mergerfs array that contains media that's difficult/impossible to replace. Daily incremental backups.
That folder is mounted on my PC through DOKAN, which tells Windows OS that it's a local resource (it does this more thoroughly than just assigning a drive letter to a NAS folder through Windows' built-in system). The PC, including the "sensitive NAS media" folder, is then backed up to Backblaze's personal backup service ($99/yr, unlimited size with one-year versioning). The DOKAN step is required for this, since Backblaze doesn't support mounted NAS drives or non-Windows systems (presumably they don't want to use space on versioned encrypted backups of hundred-terabyte pirate movie collections).
Oh, and my phone does one-way Syncthing to my PC, thus putting its files on the PC for Kopia and Backblaze to do their thing.
I use immich and nextcloud for the clients (my wife and my parents know that I only take care of that data) and on the server side I use borgmatic, which has a local repository on the second drive inside my nuc and a remote repository hosted by hetzner called "storage box", which supports borg natively.
Yes the remote is out of my physical access, but borg is fully encrypted and for 4$/3.6€/month for 1TB I feel good.
Before I started with borg and hetzner I had an rsync based backup with an odroid hc1 hosted by my parents, but that didn't feel safe. Due to the slow network at my parents' I had to sync my local backup instead of making a second backup from the real data, and the monitoring was also very bad.
From my point of view: You have no backup, if it is not automated and you have no monitoring.
Sometimes: a laughing hyena.
If you don't have tested backups, you don't have a backup.
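One cheap way to test a backup is to restore it to a scratch directory and compare checksums against the live data. Here is a rough Python sketch of that idea, assuming both trees are reachable as local paths; `tree_digests` and `verify_restore` are hypothetical helper names, not part of any backup tool.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large files don't fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def verify_restore(source, restored):
    """Return (missing, changed): files absent from or different in the restore."""
    src = tree_digests(source)
    dst = tree_digests(restored)
    missing = sorted(set(src) - set(dst))
    changed = sorted(p for p in src.keys() & dst.keys() if src[p] != dst[p])
    return missing, changed
```

Two empty lists mean every file in the source came back byte-identical. Files that changed on the live system after the backup ran will show up as changed, so this works best against a quiet dataset or a snapshot.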
I use Kopia to B2, then on a monthly basis I copy the current Kopia repo to an external drive that's otherwise kept offline in my house.
Wow, a lot of variation in this thread!
I get all my data to my server, then from there I have borgmatic do incremental backups to a backup drive on the same machine (nightly cronjob).
From there I use Rclone to get the encrypted borg backup to Backblaze B2 for cloud storage.
So for 3 2 1, my 3 copies are the original, the local backup, and the cloud backup.
My 2 media are local hard drives and cloud storage (I think it's fair to consider this a different kind of media).
And my 1 offsite is the cloud backup.
Now I'm dumb and have a fear of screwing something up so I have also started burning M-Discs of my critical data (everything except TV/movie/music stuff I can redownload). Though this was a lot more expensive than I was expecting, because of aforementioned me being dumb I already screwed up two discs (they are write once). I'm also doing two copies of each disc.
Also I have photos/home videos additionally stored in ente, they are super important to me and I wanted a separated copy someone else is looking after.
I use Backblaze B2 for one offsite backup in "the cloud" and have two local HDDs. Using restic with rclone as storage interface, the whole thing is pretty easy.
A cronjob makes daily backups to B2, and once per month I copy the most current snapshot from B2 to my two local HDDs.
I have one planned improvement: Since my server needs programmatic access to B2, malware on it could wipe both the server and B2, leaving me with the potentially one-month old local backups. Therefore I want to run a Raspberry Pi at my parents' place that mirrors the B2 repository daily but is basically air-gapped from the server. Should the B2 repository be wiped, the Raspberry Pi would still retain its snapshots.
My nas is a second copy of all my data; nothing exists only on the nas. The nas is also slowly uploading to backblaze, though data limits are slowing my progress. My photos, which I feel are the least replaceable, are automatically backed up to my nas, Google photos, and amazon photos, with manual backup to my desktop, and manual backup to an external hard drive that is stored in a fire resistant box.
My main server is backed up via Kopia to a 5 TB Hetzner Storage Box and to a second server at my parents-in-law’s place. I’ve got additional MDisc backups of old photos, Paperless PDFs and work related files that don’t change at my mother’s place as well.
My Linux ISO collection is too big to actually back up. So, I regularly create file lists and in the event of data loss, I will have to spend quite some time to rebuild it. At least, my fiber connection will help me with that.
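For anyone wanting to do the same file-list trick, a minimal Python sketch could look like the following. It records each file's relative path and size, on the assumption that this is enough to know what to fetch again; the function names and the tab-separated output format are made up for illustration.

```python
import os
from pathlib import Path


def build_file_list(root):
    """Return a sorted list of (relative_path, size_in_bytes) for every file under root."""
    root = Path(root)
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            entries.append((full.relative_to(root).as_posix(), full.stat().st_size))
    return sorted(entries)


def write_file_list(root, out_path):
    """Write one 'size<TAB>path' line per file and return the number of files listed."""
    lines = [f"{size}\t{rel}" for rel, size in build_file_list(root)]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

The resulting list is tiny compared to the collection itself, so it can ride along with the real backups of the critical data.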
3: RAID-1 pair + manual periodic sync to an external HD, roughly monthly. Databases synced to cloud.
2: external HD is unplugged when not syncing
1: External HD is a rotating pair, swapped in a bank box, roughly quarterly. Bank box costs $45/year.
If the RAID crashes, I lose at most a month. If the house burns down, I lose at most 3 months. Ransomware, unless it's really stealthy, I lose 3 months. If I had ongoing development projects, a month (or 3) would be a lot to lose, and I'd probably switch to weekly syncs and monthly swaps, but for what I actually do - media files, financial and smart-home data, 3 months would not be impossible to recreate.
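The reasoning above boils down to: after a disaster, you lose as much as the age of the freshest copy that survived. A tiny Python sketch of that, using the rough intervals from this setup (about 30 days for the external HD sync, about 90 days for the bank box swap). It assumes each copy is fully refreshed at its interval and ignores the cloud-synced databases; the names are illustrative.

```python
def worst_case_loss_days(max_age_days, destroyed):
    """Worst-case data loss in days, given which copies a disaster destroys.

    max_age_days maps copy name -> maximum staleness of that copy in days.
    Returns None if no copy survives.
    """
    surviving = [age for name, age in max_age_days.items() if name not in destroyed]
    return min(surviving) if surviving else None


copies = {
    "raid": 0,               # live data
    "external_hd_home": 30,  # synced roughly monthly
    "external_hd_bank": 90,  # swapped roughly quarterly
}

print(worst_case_loss_days(copies, {"raid"}))                      # RAID crash
print(worst_case_loss_days(copies, {"raid", "external_hd_home"}))  # house fire
```

Shortening either interval (weekly syncs, monthly swaps) shrinks the corresponding worst case directly, which is the trade-off described above.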
All of this works because my system is small enough to fit on one HDD. A 3-2-1 system for tens of TB starts to look a lot like an enterprise system.
1. User devices (main workstations, phone photos, etc)
2. Local NAS (sync w/ #1, backup to #3)
3. Cloud backup w/ commercial provider
Everything backs up to a Synology diskstation (with disk redundancy). The Syno's Hyperbackup makes backups of critical stuff to the cloud weekly. In the case of my self-hosted stuff, it's mostly the shared storage where all my docker volumes map to. Also workstation backups, home assistant backups, phone photos, etc.
A backup of the temporally replaceable stuff (everything not covered above), which is hosted from the Diskstation, is made to an external drive a few times a year and stored off-site the rest of the time. This isn't 3-2-1, but it's close enough for my needs.
All my systems are backed up with "rsnapshot" to a file server. File server is backed up to backblaze with duplicacy.
iDrive e2 with duplicati and manually to an external SSD with rsync every so often.
I was planning on asking a friend to set up a server at their home, but I feel somewhat comfortable with the current solution.
My day-to-day stuff stays in sync via syncthing on my two laptops, my desktop and my home server. They all run btrfs, so I won't be syncing any flipped bits around.
Home server rsyncs from my VPS once a week. When that's fine, it rsyncs itself over to a hetzner storage over sshfs+gocryptfs.
Four copies at home, one in the cloud.
My main storage is a mirrored pair of HDD. Versioning is handled here.
It Syncthings an "important" folder to a local backup with only 1 HDD.
The local backup Syncthings to my parents' house with 1 SSD.
My setup could be better: if I put the versioning on my local backup it'd free space on my main storage. I could migrate to a dedicated backup software, Borg maybe, over syncthing. But Syncthing I knew and understood when I was slapdashing this together. It's a problem for future me.
I've been seriously considering an Elitedesk G4 or Dell/Lenovo equivalent as backup machines. Mirrored drives. Enough oomph to HA the things using the "important" files: immich, paperless, etc.
My one other media type is “the cloud”.
I use hard drives, I can’t imagine trying to put something on a disk or something.
One thing I do recommend, I keep one unencrypted hard drive copy in the safest most hidden part of my house. This is in case encryption software disappears, or I just forget my encryption keys or something.
Other than that, one encrypted copy of files on a thumb drive in my wallet (selected files, not everything). One in my car. One in my firesafe. Then daily cloud backup.