this post was submitted on 14 Aug 2024
57 points (98.3% liked)

I'm planning to upgrade my home server and need some advice on storage options. I already researched quite a bit and heard so many conflicting opinions and tips.

Sadly, even after asking all those questions to GPT and browsing countless forums, I'm really not sure what I should go with, and need some personal recommendations, experience and tips.

What I want:

  • More storage: Right now, I only have 1 TB, which is just the internal SSD of my thin client. This amount of storage will not be sufficient for personal data anymore in the near future, and it already isn't for my movies.
  • Splitting the data: I want to use the internal drive just for stuff that actively runs, like the host OS, configs and Docker container data. Those are in one single directory and will be backed up manually from time to time. It wouldn't matter that much if they get lost, since I didn't customize a lot and mostly used defaults for everything. The personal data (documents, photos, logs), backups and movies should each get their own partition (or subvolume).
  • Encryption at rest: My personal data is currently unencrypted, and I'm very uncomfortable with that. It definitely has to be encrypted at rest, so that somebody with physical access can't just plug the drive in and see all my sensitive data in plain text. Backups are already encrypted. For the rest, like movies, astrophotography projects (huge files!), and the host, I don't care at all.
  • Extendability: If I notice one day that my storage gets insufficient, I want to just plug in another drive and extend my current space.
  • Redundancy: At least for the most important data, a hard drive failure shouldn't be a mess. I back it up regularly to an external drive (with Borg) and sometimes manually by just copying the files. Right now, the problem is that if the single drive fails, which it might, it would be very annoying. I wouldn't lose much data, since it all gets synced to my devices and I could just copy it back, and I have two offline backups available just in case, but it would still cause quite a headache.

So, here are my questions:

Best option for adding storage

My Mini-PC sadly has no additional ports for more SATA drives. The only option I see is using the 4 USB 3.0 ports on the backside. And there are a few possibilities how I can do that.

  • Option 1: Just using "classic" external drives. With that, I could add up to 4 drives. One major drawback is the price. Disks with more than 1 TB are very expensive, so I would hit my limit at 4 TB if I don't want to spend a fortune. Also, I'm not sure about the power supply and the stability of the connection. If one drive fails, a big portion of my data is lost too. I could also turn them into a RAID setup, which would halve my already limited storage space, and then the space wouldn't be enough or extendable anymore. And of course, it would just look very janky too...
  • Option 2: The same as above, but with USB hubs. That way, I could theoretically add up to 20 drives if each hub has 5 ports. That would of course be far from optimal, because I highly doubt that a single USB port can handle the power demand and the data speed and integrity with that many drives. In reality, I of course wouldn't add that many. Maybe only two per hub, and then set them up as RAID. That would make 4x2 drives.
  • And, option 3: Buy a specialized hard drive bay, like this simpler one with two slots or this more expensive one for 4 drives and active cooling. With those, I can just plug in up to 4 drives per bay, and then connect the bay via USB. The drives get their power not from the USB port, but from their own power supply. Also, they get cooled (either passively via the case if I choose one that fits only two drives, or actively with a cooling fan) and there are options to enable different storage modes, for example a built-in RAID. That would make the setup quite a bit simpler, but I'm not sure if I would lose control over formatting the drives the way I want if they get managed by the bay.
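For the hub question in option 2, a rough ceiling may help: every drive behind one USB 3.0 port shares that port's link. This is a minimal sketch, assuming the nominal 5 Gbit/s USB 3.0 signalling rate with 8b/10b encoding (about 500 MB/s before protocol overhead) and an even split between drives that are all busy at once. Real throughput will be lower, and the function name is made up for illustration.

```python
USB3_SIGNALLING_GBIT = 5.0    # nominal USB 3.0 (Gen 1) line rate
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line encoding: 8 payload bits per 10 line bits


def per_drive_ceiling_mb_s(drives_on_port: int) -> float:
    """Upper bound in MB/s per drive when all drives on one port are busy.

    Ignores protocol overhead, hub quality and the drives' own limits,
    so treat the result as a ceiling, not a prediction.
    """
    if drives_on_port < 1:
        raise ValueError("need at least one drive")
    payload_mb_s = USB3_SIGNALLING_GBIT * ENCODING_EFFICIENCY * 1000 / 8
    return payload_mb_s / drives_on_port


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(f"{n} drive(s) on one port: at most {per_drive_ceiling_mb_s(n):.0f} MB/s each")
```

With two drives per hub, as suggested above, each drive would share the link with one other drive whenever both are active, for example during a RAID rebuild.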

What would you recommend?

File system

File system type

I will probably choose BTRFS if that is possible. I thought about ZFS too, but since it isn't included by default, and BTRFS does everything I want, I will probably go with BTRFS. It would give me the option for subvolumes, some of which are encrypted, compression, deduplication, RAID or merged drives, and seems to be future-proof without any disadvantages. My host OS (Debian) is installed with Ext4, because it came like that by default, and that is fine for me. But for storage, something other than Ext4 seems to be the superior choice.

Encryption

Encrypting drives with LUKS is relatively straightforward. Are there simple ways to do that, other than via CLI? Do Cockpit, CasaOS or other web interface tools support that? Something similar to GNOME's Disk Utility for example, where setting that up is just a few clicks.

How can I unlock the drives automatically when certain conditions are met, e.g. when the server is connected to the home network, or by adding a TPM chip onto the mainboard? Unlocking the volume every time the server reboots would be very annoying.

That would of course compromise the security aspect quite a bit, but it doesn't have to be super secure. It just has to be secure enough that a malicious actor (e.g. angry ex-GF, police raid, someone breaking in, etc.) can't see all my photos by just plugging the drive in. For my threat model, anything that takes more than 15 minutes of guessing unlock options is more than enough. I could even choose "Password123" as the password, and that would be fine.

I just want the files to be accessible after unlocking, so the "Encrypt after upload"-option that Nextcloud has or Cryptomator for example isn't an option.

RAID?

From what I've read, RAID is a quite controversial topic. Some people say it's not necessary, and some say that one should never live without. I know that it is NOT a backup solution and does not replace proper 3-2-1-backups.

The thing is, I can't assess how often drives fail, and I would lose half of my available storage, which is limited, especially by $$$. For now, I would only add 1 or at most 2 TB, and then upgrade later when I really need it. And for that, having to pay 150€ versus 400€ is a huge difference.
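The capacity trade-off can be put in numbers. This is a small sketch, assuming equal-sized drives and only two layouts: plain concatenation (no redundancy) and two-way mirroring (RAID 1 style). The function names are hypothetical, and any price passed in is just whatever you plug in, not a real quote.

```python
def usable_tb(drive_tb: float, drive_count: int, mirrored: bool) -> float:
    """Usable capacity for equal-sized drives, plain or mirrored in pairs."""
    if drive_count < 1:
        raise ValueError("need at least one drive")
    if mirrored:
        if drive_count % 2:
            raise ValueError("two-way mirroring needs an even number of drives")
        return drive_tb * drive_count / 2  # every block is stored twice
    return drive_tb * drive_count


def cost_per_usable_tb(price_per_drive: float, drive_tb: float,
                       drive_count: int, mirrored: bool) -> float:
    """Total spend divided by the capacity you can actually fill."""
    total = price_per_drive * drive_count
    return total / usable_tb(drive_tb, drive_count, mirrored)
```

For example, `usable_tb(1, 4, mirrored=True)` gives 2.0, which is the "lose half of my storage" effect described above: mirroring doubles the cost per usable TB for the same drives.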

top 14 comments
[–] seaQueue 10 points 4 months ago* (last edited 4 months ago) (1 children)

Buy external drives. Don't run them in RAID, use one to store backups and plug it in once or twice a week to copy data to it.

The secret to RAID is that it doesn't buy you data protection, it buys you uptime to access data while a device in the array is failed. This is most valuable to businesses that can't afford the downtime that recovery from a backup incurs. The most paranoid RAID will still fail sooner or later, due to hardware or software failure, and as a home user with a limited budget you're far better off having one offline backup that you can use to recover data from once that happens.

Back up only data you can't afford to lose (e.g. don't back up downloaded data that can be replaced easily, like a game or movie collection) and your backups will be much more manageably sized, so you won't need to spend as much on your backup drive. If a backup disk is too much for your budget you can always use cloud backup plans; Backblaze PC backup has no limit on the size of your backups and only charges something like ~$60/yr.

Edit: It's also worth thinking about what kind of data you're storing and splitting that data across multiple devices if possible. If you're storing bulk data where performance isn't critical, like backups from other machines or a movie collection, you can pay a much lower price by buying a hard drive instead of flash. Even if only some of your data requires fast flash, you can still use a cheaper HDD to store bulk data and buy a smaller flash drive for performance-sensitive tasks. When I build a NAS I split my data into two pools: one bulk pool of HDDs and one much smaller fast pool made up of flash storage. Put performance-critical data on flash and bulk storage on HDDs; this will allow you to spend less on bulk and still have fast storage performance for tasks that require it. A 512GB or 1TB SSD alongside a 4TB, 6TB or 8TB HDD is significantly cheaper than spending on a 4TB or 8TB SSD.

Shop eBay for refurbished storage, it'll be significantly cheaper than spending on brand new drives.

[–] [email protected] 3 points 4 months ago

This is pretty great advice for getting into it. I previously ran three PowerEdge 2950s, then switched to nothing self-hosted, and am now back to everything self-hosted but on a much leaner setup: a NUC and a 14TB WD My Book drive with a dual Noctua 4020 fan shroud I 3D printed. It absolutely needed the cooling, as I killed the original drive in two weeks.

My replica is just a 14TB drive in my desktop; I run rsync to pull the data occasionally after checking SMART status on the primary. It's not versioned or perfect, but it works great to give me a chance to back up my Jellyfin media. Everything I care about also gets backed up via restic.
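The "check SMART, then rsync" routine could be scripted roughly like this. It is only a sketch under assumptions: it runs on a machine where `smartctl` can see the primary drive directly (the setup described above may well check it over SSH instead), it needs root for `smartctl`, and the function names and all paths passed in are hypothetical.

```python
import subprocess


def smart_health_passed(smartctl_output: str) -> bool:
    """Interpret the output of `smartctl -H`; unknown output counts as unhealthy."""
    for line in smartctl_output.splitlines():
        line = line.strip()
        # ATA/SATA drives report an overall self-assessment result.
        if "overall-health self-assessment test result" in line:
            return line.endswith("PASSED")
        # SCSI/SAS style output uses a different wording.
        if "SMART Health Status" in line:
            return line.endswith("OK")
    return False


def sync_if_healthy(device: str, source: str, dest: str) -> bool:
    """Run rsync only when the drive behind `device` reports healthy."""
    result = subprocess.run(["smartctl", "-H", device],
                            capture_output=True, text=True)
    if not smart_health_passed(result.stdout):
        return False  # do not overwrite the replica from a suspect disk
    subprocess.run(["rsync", "-a", source, dest], check=True)
    return True
```

Skipping the sync on a failed check keeps the last good copy on the replica untouched, which matters here because the replica is not versioned.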

Eventually plan to run a build with the Modcase MASS with multiple drives but for now this setup has been working fantastic.

[–] [email protected] 6 points 4 months ago* (last edited 4 months ago)

https://www.linuxserver.io/blog/2017-06-24-the-perfect-media-server-2017

I followed the Perfect Media Server guide. It uses mergerfs for spreading data across disks of various sizes, and SnapRAID for a level of redundancy. Though RAID isn't backup.
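As a rough illustration of how mixed disk sizes work out in such a setup: the pooled space is the sum of the data disks, and SnapRAID expects the parity disk to be at least as large as the largest data disk. The helper below is a hypothetical sketch for a single parity disk, not part of either tool.

```python
def pool_summary(data_disks_tb: list[float], parity_disk_tb: float) -> dict:
    """Summarise a pool of mixed-size data disks protected by one parity disk."""
    if not data_disks_tb:
        raise ValueError("need at least one data disk")
    return {
        # mergerfs presents the data disks as one combined file tree.
        "pooled_tb": sum(data_disks_tb),
        # The parity disk must be as big as the biggest data disk.
        "parity_ok": parity_disk_tb >= max(data_disks_tb),
    }
```

So disks of 1, 2 and 4 TB with a 4 TB parity disk would pool to 7 TB, while a 2 TB parity disk would be too small for that set.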

That said, I'm now running this setup on an N100 machine with a QNAP TL-800C JBOD USB-C box.

Works great for downloads / Plex and home server needs.

The N100 chip isn't amazing... Don't get me wrong, but it works really well for Plex.

Hope this all makes sense. I'm on mobile without my glasses. Lol

[–] InnerScientist 5 points 4 months ago

Drives connected over USB have an unstable connection in my experience. This is very annoying and gets worse with hubs.

RAID reduces the time a system is offline and reduces data loss. If a drive fails and you can afford to wait for the new disk and for the backup to restore, and you have regular backups that ensure no important data gets lost (though remember that data added between backups may be lost), then you don't need RAID.

I don't use RAID because if my disk fails, I can stomach the 2-4 days it takes to buy a new one and restore the backup.

Very important: use S.M.A.R.T. and a filesystem with checksums to make sure you're not backing up corrupted data and to know when to get a new disk.


For encryption at rest you may want to look at clevis and tang, though you need a server in your home network for this to work. The client (with clevis) then decrypts the disk at boot if it can reach the server (tang). The server can't decrypt the data without the client secret and the client can't decrypt it without the server public key.

Don't know what your server could be though, maybe a router with custom firmware?


You should also look into cloud storage/rclone, that way you can automate your backups more and reduce the need for manual intervention.

I use rclone and restic to automatically back up my servers daily, which takes a few seconds most of the time because the backups are incremental.
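One possible way to combine the two tools is restic's rclone backend, where the repository is addressed as `rclone:<remote>:<path>`. The sketch below only builds and runs that command. The remote name, paths and password file are placeholders, the function names are hypothetical, and this is not necessarily how the setup described above is wired together.

```python
import subprocess


def restic_backup_command(rclone_remote: str, remote_path: str,
                          sources: list[str], password_file: str) -> list[str]:
    """Build a restic invocation that stores the backup via an rclone remote."""
    repo = f"rclone:{rclone_remote}:{remote_path}"
    return ["restic", "-r", repo,
            "--password-file", password_file,
            "backup", *sources]


def run_backup(rclone_remote: str, remote_path: str,
               sources: list[str], password_file: str) -> None:
    """Run the backup; raises CalledProcessError if restic reports failure."""
    cmd = restic_backup_command(rclone_remote, remote_path, sources, password_file)
    subprocess.run(cmd, check=True)
```

Calling `run_backup` from a daily cron job or systemd timer would give the automated schedule mentioned above; after the first run, restic only uploads data that changed.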

[–] thirdBreakfast 5 points 4 months ago

Love the effort you've put into this question. You've clearly done some quality research and thinking.

When I asked myself this same question a couple of years ago, I ended up just buying a second hand Synology NAS to use alongside my mini-pc. That would meet your criteria, and avoids the (I'm not sure what magnitude) reliability risk of using disks connected over USB. It's more proprietary than I'd like, but it's battle tested and reliable for me.

[–] [email protected] 2 points 4 months ago (1 children)

I personally had the best experience with mergerfs (drives can be any size and can be protected by SnapRAID) and an external enclosure, up until recently. Unfortunately, USB is a limiting factor because of bandwidth and latency. I can only really recommend getting a cheap new server with SATA support if you want the system to stay usable while transferring or moving files.

[–] InnerScientist 1 points 4 months ago

How does mergerfs compare to btrfs and bcachefs in using multiple partitions?

[–] [email protected] 2 points 4 months ago

To automatically unlock encrypted drives, I followed the approach described in https://michael.stapelberg.ch/posts/2023-10-25-my-all-flash-zfs-network-storage-build/#auto-crypto-unlock

The password is split: half is stored on the server itself and half in a file on the web. During boot the server retrieves the second half via HTTP, concatenates the two halves and uses the result to unlock the drive. This way I can always remove the online key and block the automatic decryption.
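The split-key idea can be sketched as follows. This is not the linked post's actual script: the URL, device and mapper name are placeholders, the function names are made up, and stripping whitespace from the downloaded half is an assumption that has to match how the key was originally created.

```python
import subprocess
import urllib.request


def fetch_remote_half(url: str, timeout: float = 10.0) -> bytes:
    """Download the second key half; fails if the file was taken offline."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().strip()  # assumption: key contains no edge whitespace


def assemble_key(local_half: bytes, remote_half: bytes) -> bytes:
    """Concatenate the half kept on the server with the half from the web."""
    return local_half + remote_half


def unlock(device: str, mapper_name: str, key: bytes) -> None:
    """Open a LUKS device, passing the key on stdin instead of via a file."""
    subprocess.run(
        ["cryptsetup", "open", "--key-file", "-", device, mapper_name],
        input=key,
        check=True,
    )
```

Because `fetch_remote_half` raises when the file is gone, deleting the online half is enough to stop the automatic unlock, which is the kill switch described above.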

Another approach that I've considered was to store the decryption keys on a USB drive connected with a long extension cable. The idea is that someone who steals your server likely won't bother to take the cables too.

TPM is a different beast that I haven't studied yet, but my understanding is that it protects you in case someone steals your drives or tries to read them from another computer. But as long as they are in your server, it will always decrypt them automatically. Therefore you delegate the safety of your data to all the software that starts on boot: your photos may still be fully encrypted at rest, so a thief cannot get them out from the disk directly, but if you have an open SMB share they can just boot your stolen server and get them out from there.

[–] solrize 2 points 4 months ago* (last edited 4 months ago) (1 children)

Oh man, what a mess. It is just not worth it if you're only adding 1 or 2 TB. Also you don't say what kind of data you want to store on this system. If it's media files (static once written) that can simplify things.

I'd say don't mess with external drives at all. Your simplest path is to upgrade your 1TB internal SSD to 2TB or 4TB. Those aren't too expensive, and you get SSD storage. Yes, you may as well use LUKS unless you want to get fancier. I have some thoughts about key management but haven't implemented them in practice, so talking about that would be theoretical.

RAID is for when you have data that changes, like databases where you frequently add rows or do updates, so you are up to date if a drive crashes just after an update. It also lets you keep the system running while you hot swap the crashed drive. If you don't mind taking your storage offline while you restore from a backup, and you don't mind having to recreate the most recent data, you don't need RAID.

I simply keep my static stuff and backups on a Hetzner StorageBox, encrypted with Borg Backup. That eliminates all the hassles of RAID, buying hardware and keeping it at home, etc. I can remote mount it (read only) with sshfs with all cryptography happening on the client side (in practice I don't do that very often). There's no need to use an encrypted file system on the server, or for the server to ever see plaintext. Of course StorageBox is not self hosted, but you could do something similar with a bare iron storage server. Anyway I think it's difficult to beat this for economy until you have tens or maybe 100's of TB of data.

[–] [email protected] 2 points 4 months ago

+1 for borg + hetzner storage box, though externals do give pretty good value for some uses. I have all my movies/tv on a 6tb external and it would have cost so much more to do it any other way

[–] [email protected] 1 points 4 months ago* (last edited 4 months ago)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters   More Letters
NAS             Network-Attached Storage
NUC             Next Unit of Computing brand of Intel small computers
NVMe            Non-Volatile Memory Express interface for mass storage
Plex            Brand of media server package
RAID            Redundant Array of Independent Disks for mass storage
SSD             Solid State Drive mass storage

6 acronyms in this thread; the most compressed thread commented on today has 8 acronyms.

[Thread #920 for this sub, first seen 14th Aug 2024, 10:45]

[–] [email protected] 1 points 4 months ago* (last edited 4 months ago)

You've clearly done your homework, and you've gotten a lot of good feedback already, so I'll just add a few points...

  • Storage options: Personally, I'd replace the existing drive with the highest capacity I could afford. In an ideal situation, I'd keep the host on another drive (NVMe or flash) and dedicate the large drive to a single partition of data storage.

    In my own mini-PC (8th gen NUC), I've got a smaller NVMe for Proxmox and a single 8TB internal SSD for data.

  • Encryption: If you're going to bother with encryption, I wouldn't half-ass it. Why bother at all if you're fine using auto-decryption or a weak password that will be guessed with any sizeable effort? Just lock it down with a strong password and decrypt/mount the data drive after any reboot; making a shell alias or script for this is trivial. You're likely not rebooting the server more than once a week anyway.

  • Budget/Specs: I get the sense you don't have much budget right now, but knowing your hardware would help in suggesting solutions. Do you have an NVMe slot? What is the make/model of the motherboard and case?

  • Filesystem: For simple storage, this really doesn't matter and Ext4 will probably be fine. It's a mature, robust, no-frills filesystem which is perfect for bulk file storage (docs, music, videos, etc.), but Btrfs would be fine too if you want more options.

  • USB Docking Stations: I've had really good experiences with USB docking stations like this one, and I currently use it for attaching my backup HDDs each month. I wouldn't want to rely on them for realtime data access, but they do work wonderfully for backups and one-off drive access.
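For the encryption point above, the manual unlock-and-mount step after a reboot could be scripted roughly like this. It is a sketch with placeholder device, mapper name and mount point; the function names are hypothetical, and it assumes a LUKS volume and root privileges.

```python
import subprocess


def unlock_and_mount_commands(device: str, mapper_name: str,
                              mount_point: str) -> list[list[str]]:
    """Return the two commands needed after a reboot, in order."""
    return [
        # Prompts for the passphrase on the terminal.
        ["cryptsetup", "open", device, mapper_name],
        # The unlocked volume appears under /dev/mapper/<name>.
        ["mount", f"/dev/mapper/{mapper_name}", mount_point],
    ]


def unlock_and_mount(device: str, mapper_name: str, mount_point: str) -> None:
    """Run both steps; stops at the first failure (e.g. a wrong passphrase)."""
    for cmd in unlock_and_mount_commands(device, mapper_name, mount_point):
        subprocess.run(cmd, check=True)
```

Since the passphrase is typed interactively, nothing secret is stored on the server, which fits the "strong password, unlock by hand" approach.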

[–] [email protected] 1 points 2 months ago (1 children)

Hi OP, I am in a similar situation as you were so I wonder which solution you chose at the end and if you are happy with it?

[–] [email protected] 2 points 2 months ago

I chose to continue with my current setup until I get the time and motivation to upgrade.

I will build a new server from scratch. For that, I bought a used mainboard for a few bucks, which has 6 SATA ports.