this post was submitted on 14 May 2024
27 points (96.6% liked)

Selfhosted


I'm duplicating my server hardware and moving the second set off site. I want to keep the data live since the whole system will be load balanced with my on site system. I've contemplated tools like Syncthing to make a 1-to-1 copy of the data to NAS B, but I know there has to be a better way. What have you used successfully?

all 21 comments
[–] just_another_person 10 points 7 months ago* (last edited 7 months ago)

Rsync and rclone are the best options as mentioned in other comments. If you want to get real-time with it, and the previous cron-based solutions aren't what you want, look at the myriad of FOSS distributed filesystems out there. Plenty of live filesystems you can run on any Linux-based storage system.

I think the better question would be: what are you trying to achieve? Live replica set of all data in two places at the same time, or a solid backup of your data you can swap to if needed? I'd recommend the rsync/rclone route, and VPN from the primary data set whenever you need, with the safety of having your standby ready to swap out to whenever needed if the primary fails.

[–] _Analog_ 9 points 7 months ago (1 children)

Syncthing falls down when you hit millions of files.

Platform agnostic? Rsync from the perspective of duplicating the client-presented data.

Or rclone, another great tool. I tend to use this for storage providers rather than between self-hosted systems (or fully self-managed data center systems).

If the NAS uses ZFS, then zfs send/recv is the best, because you can send only the changed blocks. Want to redo the names on every single movie? No problem! I do recommend sanoid or syncoid; I don't remember which. ZFS snapshots are not intuitive, so make sure to do LOTS of testing with small data sets you don't care about.
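To make the "send only the changed blocks" idea concrete, here is a minimal sketch that only builds the shell pipeline for an incremental send between two snapshots. The dataset, snapshot, and host names are hypothetical placeholders, and it assumes both snapshots already exist on the sender and the older one already exists on the receiver. For what it's worth, syncoid is the replication tool that ships with the sanoid project and automates this kind of pipeline.

```python
import shlex


def incremental_send_pipeline(dataset, prev_snap, new_snap, remote, target):
    """Build a shell pipeline that ships only the blocks that changed
    between two snapshots of `dataset` to `target` on `remote`.

    Nothing is executed here; the string is returned for inspection.
    """
    # `zfs send -i old new` emits an incremental stream (delta between snapshots).
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # The receiving side applies the stream with `zfs receive`, run over SSH.
    recv = ["ssh", remote, shlex.join(["zfs", "receive", target])]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


# Hypothetical names, for illustration only:
# print(incremental_send_pipeline("tank/media", "monday", "tuesday",
#                                 "backup@nas-b", "backup/media"))
```

Printing the command instead of running it fits the advice above: try it first on a small dataset you don't care about.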

In terms of truly duplicating the entire NAS, we can’t help without knowing which NAS you’re using. Or since this is selfhosted, which software you used to build a NAS.

[–] [email protected] 1 points 7 months ago

+1 for rclone

[–] [email protected] 7 points 7 months ago

Rsync or rclone are better ways than Syncthing. Rsync can copy over SSH out of the box. Rclone can do the same but with a lot more backends: FTP, SSH, S3... it does not matter. IMO rclone is the better choice than rsync in this case. Take a look at rclone.org
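As a rough sketch of what an rclone-based mirror could look like, the helper below assembles an `rclone sync` command line. The remote name `nasb` is an assumption (it would have to be created beforehand with `rclone config`), and the paths are placeholders. Note that `rclone sync` makes the destination match the source, which includes deleting files on the destination, so the sketch defaults to `--dry-run`.

```python
def rclone_sync_argv(source, remote, remote_path, dry_run=True):
    """Build the argument list for a one-way `rclone sync`.

    `remote` is the name of an already configured rclone remote
    (hypothetical here); the backend behind it can be SFTP, S3, etc.
    """
    argv = ["rclone", "sync", source, f"{remote}:{remote_path}"]
    if dry_run:
        # Report what would change without touching the destination.
        argv.append("--dry-run")
    return argv


# Example (placeholder names); pass the list to subprocess.run to execute it:
# rclone_sync_argv("/srv/data", "nasb", "data")
```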

[–] [email protected] 4 points 7 months ago (1 children)

Just have NAS A send a rocket with the data to NAS B.

[–] merthyr1831 4 points 7 months ago (1 children)

Rsync over FTP. I use it for a weekly Nextcloud backup to a Hetzner storage box.

[–] [email protected] 4 points 7 months ago* (last edited 7 months ago) (1 children)

I suggest using SFTP/SSH with rsync instead. It's much more secure than FTP.
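A minimal sketch of rsync over SSH, assuming key-based SSH login is already set up. The user, host, and paths are placeholders, and the SSH port is a parameter because some storage providers listen on a non-default one.

```python
def rsync_ssh_argv(source_dir, remote, dest_dir, port=22, delete=False):
    """Build the argument list for an rsync transfer tunnelled over SSH.

    `remote` is a `user@host` string (hypothetical here).
    """
    # -a preserves permissions, times, and symlinks; -e selects the transport.
    argv = ["rsync", "-a", "-e", f"ssh -p {port}"]
    if delete:
        # Also remove files on the destination that no longer exist locally.
        argv.append("--delete")
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    argv += [source_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return argv


# Example (placeholder names); pass the list to subprocess.run to execute it:
# rsync_ssh_argv("/srv/nextcloud/data", "backup@nas-b", "/backups/nextcloud")
```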

[–] semperverus 1 points 7 months ago
[–] flux 3 points 7 months ago

What about rclone? I've found it to be amazing for cloning or copying.

[–] pyrosis 2 points 7 months ago

My favorite is using the native ZFS sync capabilities, though that requires ZFS and properly configured snapshots.

[–] [email protected] 2 points 7 months ago

> I want to keep the data live since the whole system will be load balanced with my on site system.

Is this intended to handle the scenario where you accidentally delete a bunch of important files and don't realize until the delete has synced, and deleted them on the remote site too? Consider using versioned backups too, to handle that case.
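One common way to get versioned backups with plain rsync is dated snapshot directories combined with `--link-dest`, which hard-links unchanged files against the previous snapshot so each version only costs the space of what changed. The sketch below just builds such a command. The directory layout and the timestamp naming are assumptions for illustration, not something from this thread.

```python
from datetime import datetime
from pathlib import PurePosixPath


def snapshot_argv(source_dir, backup_root, now, previous=None):
    """Build an rsync command that writes a new dated snapshot directory.

    `previous` is the name of the last snapshot directory under
    `backup_root` (hypothetical layout). Files unchanged since then are
    hard-linked rather than copied, so a deletion on the source only
    affects snapshots taken after it.
    """
    root = PurePosixPath(backup_root)
    target = root / now.strftime("%Y-%m-%dT%H%M%S")
    argv = ["rsync", "-a"]
    if previous is not None:
        # An absolute path avoids rsync resolving it relative to the destination.
        argv.append(f"--link-dest={root / previous}")
    argv += [source_dir.rstrip("/") + "/", f"{target}/"]
    return argv


# Example (placeholder paths):
# snapshot_argv("/srv/data", "/mnt/backup", datetime.now(), previous="2024-05-13T025500")
```

Because older snapshot directories are never modified by later runs, an accidental delete that has already synced can still be recovered from an earlier snapshot.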

[–] [email protected] 1 points 7 months ago* (last edited 7 months ago)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters    More Letters
NAS              Network-Attached Storage
SSH              Secure Shell for remote terminal access
VPN              Virtual Private Network
ZFS              Solaris/Linux filesystem focusing on data integrity

4 acronyms in this thread; the most compressed thread commented on today has 5 acronyms.

[Thread #747 for this sub, first seen 14th May 2024, 02:55]

[–] vegetaaaaaaa 1 points 7 months ago
  • rsync + basic scripting for periodic sync, or
  • distributed/replicated filesystems for real-time sync (I would start with Ceph)
[–] [email protected] 1 points 7 months ago (1 children)

Better options have already been mentioned. With that said, another option might be torrenting.

[–] [email protected] 1 points 7 months ago

You would need to create a new torrent whenever new files are added or edited. Not very practical for continuous use.

[–] [email protected] 1 points 7 months ago

Sounds like you want a clustered filesystem like GPFS, Ceph, or Gluster.

[–] [email protected] 1 points 7 months ago

Finding the right solution will depend entirely on what kind of load you're balancing.

[–] [email protected] -2 points 7 months ago (1 children)

If you want to mirror the entire system, OS and all, then Clonezilla is the best option.

[–] [email protected] 1 points 7 months ago

That won't keep it constantly in sync, though.