Not sure if this is a better fit for datahoarder or some selfhost community, but I'm putting my money on this one.
The problem
I currently have a cute little server with two drives connected to it, running a few different services (mostly media serving and torrents). The key facts here are that 1) it's cute and little, and 2) it's handling pretty bulky data. Cute and little doesn't go very well with big RAID setups and such, and apart from upgrading one of the drives I'm probably at my limit in terms of how much storage I can physically fit in the machine. Also, if I want to reinstall it or something, that's very difficult to do without downtime, since I'd have to move the drives and services off to a different machine (not a huge problem since I'm the only one using it, but I don't like it).
Solution
A distributed FS would definitely solve the issue of physically fitting more drives into the chassis, since I could basically just connect drives to a Raspberry Pi and have that Pi join the distributed FS. Great.
I think it could also solve the issue of potential downtime if I reinstall or do maintenance, since I can have multiple services read off the same distributed FS and reroute my reverse proxy to use the new services while the old ones are taken offline. There will potentially be a disruption, but no downtime.
Candidates
I know there are many different solutions for distributed filesystems, such as Ceph, MooseFS, GlusterFS and MinIO. I'm kinda leaning towards Ceph because of its integration in Proxmox, but it also seems like the most complicated solution in the bunch. Is it worth it? What are your experiences with these, and given the above description of my use case, which do you think would be the best fit?
Since I already have a lot of data, it's a bonus if it's easy to migrate from my current filesystem somehow.
My current setup uses a lot of hard links as well, so it's a big bonus if the solution has something similar (i.e. some easy way of storing the same data in multiple places without duplicating it).
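To make the hard link requirement concrete, here is a minimal Python sketch of the behaviour I'm relying on today. The file names are made up for illustration, and it assumes a local filesystem that supports hard links; whether a given distributed FS behaves the same way is exactly what I'd want to check.

```python
import os
import tempfile


def hardlink_demo() -> tuple[bool, int]:
    """Create a file, hard-link it under a second name, and report
    whether both names refer to the same data and how many names exist."""
    with tempfile.TemporaryDirectory() as d:
        # Hypothetical names: one copy for the media library, one for seeding.
        original = os.path.join(d, "library-movie.mkv")
        linked = os.path.join(d, "seeding-movie.mkv")

        with open(original, "wb") as f:
            f.write(b"x" * 1024)

        # A hard link adds a second directory entry for the same inode;
        # no data is copied.
        os.link(original, linked)

        same_file = os.path.samefile(original, linked)
        link_count = os.stat(original).st_nlink
        return same_file, link_count


if __name__ == "__main__":
    same_file, link_count = hardlink_demo()
    print("same underlying file:", same_file)
    print("number of names for it:", link_count)
```

Running the same check against a mount point of a candidate filesystem (instead of a temp directory) would be a quick way to see if `os.link` is supported there at all.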
I think you're spot on with LLMs being mostly trained on these kinds of tasks. Can't say I'm an expert in how to build a training set, but I imagine it's quite easy to do with these kinds of problems because it's easy to classify a solution as correct or incorrect. This is in contrast to larger problems which are less guided by algorithmic efficiency and more by sound design/architecture.
Still, I think it's quite impressive. You don't have to go very far back in time to have top of the line LLMs unable to solve these kinds of problems.
Usually with AoC, part 1 is brute-forceable but part 2 is not. Very often part 1 is to find the 100th number, and part 2 is to find the 1 000 000 000 000th number or something. Last year, out of curiosity, I had a brute-force solution for one problem that successfully completed on ~90% of the input. The solution was multi-threaded and ran on a 16-core CPU for about 20 days before I gave up. But the LLMs this year (not sure if this was a problem last year) are among the fastest users to solve the problems.
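As a rough illustration of why the "find the 1 000 000 000 000th number" kind of part 2 needs something other than brute force, here is a minimal Python sketch of one common trick, cycle detection. It assumes the sequence is produced by a deterministic step function over a finite set of states, which is not true of every puzzle, and `toy_step` is a made-up stand-in rather than any actual AoC problem.

```python
from typing import Callable, Hashable, TypeVar

S = TypeVar("S", bound=Hashable)


def nth_state(step: Callable[[S], S], start: S, n: int) -> S:
    """Return the state after applying `step` n times to `start`,
    skipping ahead as soon as a previously seen state repeats."""
    seen: dict[S, int] = {}  # state -> index at which it first appeared
    state = start
    i = 0
    while i < n:
        if state in seen:
            # States from seen[state] onward repeat with this period.
            cycle_len = i - seen[state]
            remaining = (n - i) % cycle_len
            for _ in range(remaining):
                state = step(state)
            return state
        seen[state] = i
        state = step(state)
        i += 1
    return state


def toy_step(x: int) -> int:
    """Hypothetical step function with at most 1000 distinct states."""
    return (x * 7 + 3) % 1000


if __name__ == "__main__":
    # Brute force would need 10**12 iterations; this needs at most ~2000.
    print(nth_state(toy_step, 1, 10**12))
```

The work is bounded by the number of distinct states rather than by `n`, which is the difference between finishing instantly and running for 20 days.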