this post was submitted on 19 May 2024
67 points (95.9% liked)


Title is TLDR. More info about what I'm trying to do below.

My daily driver computer is Laptop with an SSD. No possibility to expand.

So for storage of lots n lots of files, I have an old, low resource Desktop with a bunch of HDDs plugged in (mostly via USB).

I can access Desktop files via SSH/SFTP on the LAN. But it can be quite slow.

And sometimes (not too often; this isn't a main requirement) I take Laptop to use elsewhere. I do not plan to make Desktop available outside the network so I need to have a copy of required files on Laptop.

Therefore, sometimes I like to move the remote files from Desktop to Laptop to work on them. To make a sort of local cache. This could be individual files or directory trees.

But then I have a mess of duplication. Sometimes I forget to put the files back.

Seems like Laptop could be a lot more clever than I am and help with this. Like could it always fetch a remote file which is being edited and save it locally?

Is there any way to have Laptop fetch files, information about file trees, etc, located on Desktop when needed and smartly put them back after editing?

Or even keep some stuff around. Like lists of files, attributes, thumbnails etc. Even browsing the directory tree on Desktop can be slow sometimes.

I am not sure what this would be called.

Ideas and tools I am already comfortable with:

  • rsync is the most obvious foundation to work from but I am not sure exactly what would be the best configuration and how to manage it.

  • luckybackup is my favorite rsync GUI front end; it lets you save profiles, jobs etc which is sweet

  • FreeFileSync is another GUI front end I've used but I am preferring lucky/rsync these days

  • I don't think git is a viable solution here because there are already git directories included, there are many non-text files, and some of the directory trees are so large that they would cause git to choke looking at all the files.

  • syncthing might work. I've been having issues with it lately but I may have gotten these ironed out.

Something a little more transparent than the above would be cool but I am not sure if that exists?

Any help appreciated, even just an idea of what to web search for, because I am stumped even on that.
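
To make the "fetch, edit, put back" idea a bit more concrete, here is a minimal sketch of what I mean, built on rsync. It only constructs the commands; it does not run them. The host name `desktop`, the paths, and the function names `checkout`/`checkin` are all made up for illustration. `--archive`, `--update` and `--dry-run` are standard rsync options; `--update` skips files that are newer on the receiving side.

```python
import shlex


def rsync_command(src, dest, dry_run=False):
    """Build an rsync command that copies the contents of src into dest.

    --update makes rsync skip files that are newer on the receiving side,
    so a stale copy does not overwrite fresher edits.
    """
    cmd = ["rsync", "--archive", "--update"]
    if dry_run:
        cmd.append("--dry-run")
    # A trailing slash on the source means "the contents of this directory".
    cmd += [src.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


def checkout(remote, local, dry_run=False):
    """Command to pull a tree from Desktop to Laptop (hypothetical helper)."""
    return rsync_command(remote, local, dry_run=dry_run)


def checkin(remote, local, dry_run=False):
    """Command to push local edits back to Desktop (hypothetical helper)."""
    return rsync_command(local, remote, dry_run=dry_run)


if __name__ == "__main__":
    remote = "desktop:/mnt/hdd1/projects/foo"  # assumed host and path
    local = "/home/me/cache/foo"               # assumed local cache dir
    print(shlex.join(checkout(remote, local, dry_run=True)))
    print(shlex.join(checkin(remote, local, dry_run=True)))
```

Something would still have to remember which trees are checked out and remind me to put them back, which is the part I am hoping an existing tool already handles.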

[–] [email protected] 4 points 7 months ago (1 children)

NFS and ZeroTier would likely work.

When at home, NFS will be similar to a local drive, though a bit slower. Faster than SSHFS. NFS is often used to expand limited local space.

I expect a cache layer on NFS is simple enough, but that is outside my experience.

The issue with syncing is usually needing to sync everything.

[–] [email protected] 1 points 7 months ago (4 children)

What would be the role of Zerotier? It seems like some sort of VPN-type application. I don't understand what it's needed for though. Someone else also suggested it albeit in a different configuration.

Just doing some reading on NFS, it certainly seems promising. Naturally ArchWiki has a fairly clear instruction document. But I am having a hard time seeing what it is exactly. Why is it faster than SSHFS?

Using the Cache with NFS > Cache Limitations with NFS:

Opening a file from a shared file system for direct I/O automatically bypasses the cache. This is because this type of access must be direct to the server.

Which raises the question what is "direct I/O" and is it something I use? This page calls direct I/O "an alternative caching policy" and the limited amount I can understand elsewhere leads me to infer I don't need to worry about this. Does anyone know otherwise?

The issue with syncing is usually needing to sync everything.

Yes, this is why syncthing proved difficult when I last tried it for this purpose.

Beyond the actual files, it would be really handy if some lower-level stuff could be cached/synced between devices. Like thumbnails and other metadata. To my mind, remotely perusing Desktop filesystem from Laptop should be just as fast as looking through local files. I wouldn't mind having a reasonable chunk of local storage dedicated to keeping this available.

[–] [email protected] 2 points 7 months ago

If there is sufficient RAM on the laptop, Linux will cache a lot of metadata in other cache layers without NFS-Cache.

[–] [email protected] 2 points 7 months ago

ZeroTier allows for a mobile, LAN-like experience. If the laptop is at a café, the files can be accessed as if at home, within network performance limits.

[–] [email protected] 2 points 7 months ago

NFS-Cache is a specific cache for NFS, and does not represent all caching that can be done of files over NFS. "Direct I/O" is also a specific thing, and should not be generalized in the meanings of "direct" and "I/O".

Let's skip those entirely for now as I cannot simply explain either. I doubt either will matter in your use case, but look back if performance lags.

One laptop accessing one NFS share will have good performance on a quite local network.

NFS is an old protocol that is robust and used frequently. NFSv3 is not encrypted. NFSv4 has support for encryption. (ZeroTier can handle the encryption.)

SSHFS is a pseudo file system layered over SSH. SSH handles encryption. SSHFS is maybe 15 years old and is aimed at convenience. SSH is largely aimed at moving streams of text between two points securely. Maybe it is faster now than it was.

[–] [email protected] 2 points 7 months ago (1 children)

NFS is generally the way network storage appliances are accessed on Linux. If you know you're going to be accessing files on a particular computer over the long term, it's generally the way to go, since it's a simple, robust, high performance protocol that's used by pros and amateurs alike. SSHFS is an abuse of the ssh protocol that allows you to mount a directory on any computer you can get an ssh connection to. You can think of it like VSCode remote editing, but it'll work with any editor or other program.

You should be able to set up NFS with write caching, etc. that will allow it to be more similar in performance to a local filesystem. Note that you may not want write caching specifically if you're going to suddenly disconnect your laptop from the network without unmounting the share first. Your actual performance might not be the same, especially for large transfers, due to the throughput of your network and connection quality. In my general experience SSHFS is kind of slow, especially when accessing many different small files, and NFS is usually much faster.
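
If you want to check the "many small files" difference on your own setup rather than take my word for it, a rough sketch like the one below could help. It walks a directory tree, stats every file, and reports how long that took. The function name `time_tree_walk` and the mount points in the usage line are just examples I made up; point it at wherever you mounted the same Desktop directory over NFS and over SSHFS.

```python
import os
import sys
import time


def time_tree_walk(root):
    """Walk root, stat every file, and return (file_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                os.stat(os.path.join(dirpath, name))
            except OSError:
                # Skip files that vanish or cannot be read mid-walk.
                continue
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    # Usage (example paths): python walk_timing.py /mnt/desktop-nfs /mnt/desktop-sshfs
    for root in sys.argv[1:]:
        count, seconds = time_tree_walk(root)
        print(f"{root}: {count} files in {seconds:.2f}s")
```

Run it more than once per mount: a repeat run may be faster because the laptop has cached metadata in RAM, so compare like with like.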

[–] [email protected] 1 points 7 months ago

Thanks this comment is v helpful. A persuasive argument for NFS and against sshfs!