this post was submitted on 30 May 2024
24 points (92.9% liked)

Selfhosted

41441 readers
1095 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

  1. Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.

  2. No spam posting.

  3. Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.

  4. Don't duplicate the full text of your blog or github here. Just post the link for folks to click.

  5. Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).

  6. No trolling.

Resources:

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

founded 2 years ago
MODERATORS
 

I moved off a Synology NAS to a self-managed machine and one thing I still struggle to replace is something like a synology drive. Here are my requirements:

  • server side store data in a plain FS (I want transparency)
  • client side (windows), it must support VFS (download files when needed, support offloading of large files)
  • having snapshots of data is a must

I have a 40gbit uplink to my desktop, so if everything else fails I’ll just use samba with zfs snapshots exposed to VSS, but we’re talking some large files still (think several hundreds of MBs) and I’m not sure Blender will be happy working off a network disk.

I’ve been pointed to next/own-cloud previously, but they don’t seem to cover my use case, I think. Should I actually try one of those? I browsed around owncloud's storage bit (which is written in go), and it seems mostly fitting, but I’ve been told I should steer away from ownCloud towards nextCloud.

you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 2 points 8 months ago (1 children)

when you said that Nextcloud might not meet your needs, was your concern specifically the server-side data format?

I'd prefer them as plain files. Technically it doesn’t matter much to me if it's a database, if I have to spin up an S3-compatible API, or if I need to slice up a zvol for it, but I just prefer the files because then I can do zfs snapshots (in which I trust) and backup with restic (in which I trust)

[–] computergeek125 2 points 7 months ago

Apologies for being late, I wanted to be as correct as I could be.

So, straight to the point: Nextcloud by default uses plain files if you don't configure the primary storage to be an S3/object store. As far as I can tell, this is not automatic and is an intentional change at system creation by the original admin. There is a third-party migration script, but there does not appear to be a first-party method of converting between the two. That's very good news for you! (I think/hope)

My instance was set up as a standalone, so I cannot speak for the all-in-one image. Poking around the root data directory (datadirectory in the config.php), I was able to locate my user account by internal username - which if you do not use LDAP will be the shortened login name. On default LDAP configs, this internal username may be a GUID, but that can be changed during the LDAP enablement process by overriding the Internal Username field in the Expert LDAP settings.

Once in the user's home folder in the root data directory, my subdirectory options are cache, files, files_trashbin, files_versions, uploads.

  • files contains the "live" structure of how I perceive my Nextcloud home folder in the Web UI and the Nextcloud Desktop sync engine
  • files_trashbin is an unstructured data folder containing every file that was deleted by this user and kept per the trash folder's retention policy (this can be configured at the site level). Files retain their original name, but have a suffix added which takes the form .d######... where the numbers appear to be a Unix timestamp, likely the deletion date. A quick scan of these with the file command in Linux showed that each one had an expected file header based on its extension (i.e., a .png showed as a PNG image with an expected resolution). In the Web UI, there is metadata about which folder the file originally resided in, but I was not able to quickly identify this in the file structure. I believe this info is coming in from the SQL database.
  • files_version are how Nextcloud is storing its file version history (if enabled). Old versions are cleaned up per a set of default behaviors to keep more copies of more recent changes, up to a maximum age deletion threshold set at the site level. This folder is stored in approximately the same structure as the main files live structure, however each copy of each version is appended a suffix .v######... where the number appears to be the Unix timestamp the version was taken (*I have not verified that this exactly matches what the UI shows, nor have I read the source code that generates this). I've spot checked via the Linux file command and sha256 that the files in this versions structure appear to be real data - tested one Excel doc and one plain text doc.

I think that should get a fairly rough answer to your original question, but if I left something out you're curious about, let me know.


Finally, I wanted to thank you for making me actually take a look at how I had decided to configure and back up my Nextcloud instance and ngl it was kind of a mess. The trash bin and versions can both get out of hand if you have frequently changing or deleting/recreating files (I have network synchronization glued onto some of my games that do not have good remote save support). Retention policy on trash and versions cleaned up extraneous data a lot, as only one of those was partially configured.

I can see a lot of room for improvements.... just gotta rip the band-aid off and make intelligent decisions rather than just slapping an rsync job that connects to the Nextcloud instance and replicates down the files and backend database. Not terrible, but not great.

In the backend I'm already using ZFS for my files and Redis database, but my core SQL database was located on the server's root partition (which is XFS - I'd rather not mess with a DKMS module from a boot CD if something happens and upstream borks the compile, which is precisely what happened when I upgraded to OpenZFS 2.1.15).

I do not have automatic ZFS snapshots configured at this time, but based on the above, I'm reasonably confident that I could get data back from a ZFS snapshot if any of the normal guardrails within Nextcloud failed or did not work as intended (trash bin and internal version history). Plus, the data in that cursed rsync backup should be at least 90% functional.