this post was submitted on 21 Dec 2024

I currently have a 1 TiB NVMe drive that has been hovering at 100 GiB left for the past couple months. I've kept it down by deleting a game every couple weeks, but I would like to play something sometime, and I'm running out of games to delete if I need more space.

That's why I've been thinking about upgrading to a 2 TiB drive, but I just saw an interesting forum thread about LVM cache. The promise of having the storage capacity of an HDD with (usually) the speed of an SSD seems very appealing, but is it actually as good as it seems to be?

And if it is possible, which software should be used? LVM cache seems like a decent option, but I've seen people say it's slow. bcache is also sometimes mentioned, but apparently that one can be unreliable at times.

Beyond that, what method should be used? The Arch Wiki page for bcache mentions several options. Some only seem to cache writes, while some aim to keep the HDD idle as long as possible.

Also, does anyone run a setup like this themselves?

[–] [email protected] 2 points 1 week ago (1 children)

You can do it but I wouldn't recommend it for your use-case.

Caching is nice but only if the data that you need is actually cached. In the real world, this is unfortunately not always the case:

  1. Data that you haven't used for a while may be evicted. If you only need something infrequently, it will probably be read from the slow backing drive when you do need it.
  2. The cache layer doesn't know which data is actually important and cannot make smart decisions; all it sees is I/O operations on blocks. As a result, not all data that would benefit from caching is actually cached. Block-level caching solutions only store data in the cache where they (with their extremely limited view) think it's most beneficial. Bcache, for instance, skips the cache entirely if writing the data to the cache would be slower than the assumed speed of the backing storage, and it only caches I/O operations below a certain size.
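
To make the eviction problem concrete, here is a toy simulation. It assumes a plain least-recently-used policy and made-up block numbers; real caches such as bcache or LVM cache use their own policies and heuristics, so treat this as a sketch of the effect, not of any actual implementation.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Toy block cache that keeps only the most recently used block numbers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block number -> None, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block):
        """Return True on a cache hit, False on a miss (served by the slow drive)."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.blocks[block] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


cache = LRUBlockCache(capacity=100)

# A rarely played game occupying (hypothetical) blocks 0..49 is read once.
for b in range(50):
    cache.read(b)

# Everyday work then touches 200 other blocks and pushes the game out.
for b in range(1000, 1200):
    cache.read(b)

# Coming back to the game: count how many of its blocks are still cached.
game_hits = sum(cache.read(b) for b in range(50))
print(game_hits)  # 0: every block has to come from the slow drive again
```

The cache did exactly what it was designed to do, and the game still loads at HDD speed.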

Keeping data that must always be fast on fast storage is the best option.

Manually separating data that needs to be fast from data that doesn't is almost always better than relying on dumb caching that cannot know what data is the most beneficial to put or keep in the cache.

This brings us to the question: What are those 900 GiB you store on your 1 TiB drive?

That would be quite a lot if you only used the machine for regular desktop purposes, so clearly you're storing something else too.

You should look at that data and see which of it actually needs fast access speeds. If you store multimedia files (video, music, pictures etc.), those would be good candidates to store on a slower, more cost-efficient storage medium instead.
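
If you want a quick first look at how much of a directory is multimedia, something like the following could work. It is only a sketch: the extension list is my own guess at what counts as "media", and it sums apparent file sizes, which will not match on-disk usage exactly on a filesystem with compression or shared extents.

```python
import os
from collections import defaultdict

# Hypothetical list of "media" extensions; adjust it to your own data.
MEDIA_EXTENSIONS = {".mp4", ".mkv", ".mp3", ".flac", ".jpg", ".jpeg", ".png"}


def usage_by_kind(root):
    """Sum apparent file sizes under root, split into 'media' and 'other'."""
    totals = defaultdict(int)
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size  # apparent size in bytes
            except OSError:
                continue  # file vanished or is unreadable; skip it
            ext = os.path.splitext(name)[1].lower()
            kind = "media" if ext in MEDIA_EXTENSIONS else "other"
            totals[kind] += size
    return dict(totals)
```

Calling `usage_by_kind(os.path.expanduser("~"))` returns a dict of byte totals per kind, which is enough to tell whether moving media off the SSD would free a meaningful amount of space.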

You mentioned games, which can be quite large these days. If you keep unplayed games around because you might play them again at some point and don't want to sit through a large download when that point comes, you could simply create a new games library on the secondary drive and move the games you aren't currently playing into that library. If you need one, it's right there immediately (albeit with slower loading times), and you can simply move the game back should you start actively playing it again.
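
For games that are just plain directories, the "move it to the slow library and back" step can be sketched like this. The function name and the library paths are hypothetical, and if your launcher has its own option for moving an install between libraries, that is probably the safer route because the launcher keeps track of where things live.

```python
import shutil
from pathlib import Path


def move_game(name, src_library, dst_library):
    """Move the directory `name` from one library folder to another."""
    src = Path(src_library) / name
    dst = Path(dst_library) / name
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    Path(dst_library).mkdir(parents=True, exist_ok=True)
    # Across filesystems, shutil.move copies the tree and then deletes the source.
    return Path(shutil.move(str(src), str(dst)))
```

Moving a game back to the SSD is the same call with the two library arguments swapped.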

You could even employ a hybrid approach where you carve out a small portion of your (then much emptier) fast storage to use for caching the slow storage. Just a few dozen GiB of SSD cache can make a huge difference in general HDD usability (e.g. browsing it), and 100-200 GiB could accelerate a good bit of actual data too.

[–] qaz 1 points 1 week ago* (last edited 1 week ago) (1 children)

According to Filelight I have 457 GiB in my home directory. 85 GiB of that is games, but I also have several virtual machines which take up about 100 GiB. The / folder contains 38 GiB, most of which is due to the nix store (15 GiB) and system libraries (/usr is 22.5 GiB). I made a post about trying to figure out what was taking up storage 9 months ago. It's probably time to try pruning docker again.

EDIT: ncdu says I've stored 129.1 TiB lol

EDIT 2: docker and podman are using about 100 GiB of images.

[–] [email protected] 3 points 1 week ago

> I also have several virtual machines which take up about 100 GiB.

This would be the first thing I'd look into getting rid of.

Could these just be containers instead? What are they storing?

> nix store (15 GiB)

How large is your (I assume home-manager) closure? If this is 2-3 generations worth, that sounds about right.

> system libraries (/usr is 22.5 GiB).

That's extremely large, roughly 2x what you'd expect a typical system to have.

You should have a look at what's using all that space using your system package manager.

> EDIT: ncdu says I've stored 129.1 TiB lol

If you're on btrfs and have a non-trivial subvolume setup, you can't just let ncdu loose on the root subvolume. You need to take a more principled approach.

To assess your actual working size, you need to ignore snapshots, for instance, because they mostly reference the same extents as your "working set".

You need to keep in mind that snapshots do themselves take up space too though, depending on how much you've deleted or written since taking the snapshot.
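
Here is a toy model of why a naive per-directory sum overshoots so badly, and of the space snapshots really do cost. All numbers are invented, and extents are modeled as fixed 1 GiB units purely for illustration; real btrfs extents vary in size.

```python
# Each subvolume or snapshot is modeled as the set of extent IDs it references.
EXTENT_GIB = 1  # assumed fixed extent size, for illustration only

working_set = set(range(0, 400))
# Older snapshots share most extents with the working set, but also still
# reference extents 500+ that have since been deleted from it.
snapshot_a = set(range(0, 380)) | set(range(500, 530))
snapshot_b = set(range(0, 395)) | set(range(500, 510))

trees = {"working": working_set, "snap_a": snapshot_a, "snap_b": snapshot_b}

# What a tool sees when it walks every tree and adds up the sizes.
naive_total = sum(len(t) for t in trees.values()) * EXTENT_GIB

# What is actually allocated: every extent counted once.
actual_total = len(set().union(*trees.values())) * EXTENT_GIB

# Space held only by snapshots, i.e. what deleting them would free.
snapshot_only = len((snapshot_a | snapshot_b) - working_set) * EXTENT_GIB

print(naive_total, actual_total, snapshot_only)  # 1215 430 30
```

The naive walk reports almost three times the real usage, while the snapshots only pin 30 GiB that the working set no longer uses. With many snapshots the naive figure keeps growing even though the disk usage barely changes.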

btdu is a great tool to analyse space usage of a non-trivial btrfs setup in a probabilistic fashion. It's not available in many distros but you have Nix and we have it of course ;)

Snapshots are the #1 most likely cause of your space usage woes. Any space usage that you cannot explain using your working set is probably caused by them.

Also: Are you using transparent compression? IME it can reduce space usage of data that is similar to typical Nix store contents by about half.
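
If you're curious how compressible a given file is before enabling anything, a rough estimate can be had with a few lines. This is a sketch using Python's zlib as a stand-in; the filesystem may use a different algorithm and compresses data in chunks, so the ratio you get here is only a ballpark.

```python
import zlib


def compression_ratio(data: bytes, level: int = 3) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)


def file_compression_ratio(path: str, sample_bytes: int = 1 << 20) -> float:
    """Estimate the ratio from the first `sample_bytes` of a file."""
    with open(path, "rb") as f:
        return compression_ratio(f.read(sample_bytes))
```

A result around 0.5 for typical files in a directory would line up with the "about half" figure; already-compressed media and archives will come out close to 1.0.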