My memory sticks are all 32GB DDR4 at 2133MT/s.
raldone01
Each card has 24GB, so 48GB VRAM total. I use ollama; it fills whatever VRAM is available on both cards and runs the rest on the CPU cores.
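To give a rough feel for that GPU/CPU split, here is a back-of-the-envelope sketch. It is not how ollama actually decides; it assumes all layers are the same size, ignores the KV cache, and the function name, the `reserve_gb` headroom and the example numbers (fp16 weights at 2 bytes per parameter, so roughly 140GB for a 70B model with 80 layers) are my own assumptions for illustration.

```python
def split_layers(model_gb: float, n_layers: int, vram_gb: float,
                 reserve_gb: float = 2.0) -> tuple[int, int]:
    """Return (gpu_layers, cpu_layers) for a naive equal-size-layer model.

    Hypothetical helper: real runtimes also budget for the KV cache,
    context size and per-GPU overhead, so treat this as an estimate only.
    """
    per_layer_gb = model_gb / n_layers            # assume every layer weighs the same
    usable_gb = max(vram_gb - reserve_gb, 0.0)    # keep some headroom free (assumed 2GB)
    gpu_layers = min(n_layers, int(usable_gb // per_layer_gb))
    return gpu_layers, n_layers - gpu_layers


if __name__ == "__main__":
    # Assumed inputs: ~140GB of weights, 80 layers, 48GB VRAM across both cards.
    gpu, cpu = split_layers(model_gb=140.0, n_layers=80, vram_gb=48.0)
    print(f"{gpu} layers on GPU, {cpu} layers on CPU")
```

With those assumed inputs most layers end up on the CPU side, which is why the CPU and RAM speed matter so much for the overall tokens/s.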
My specs because you asked:
CPU: Intel(R) Xeon(R) E5-2699 v3 (72) @ 3.60 GHz
GPU 1: NVIDIA Tesla P40 [Discrete]
GPU 2: NVIDIA Tesla P40 [Discrete]
GPU 3: Matrox Electronics Systems Ltd. MGA G200EH
Memory: 66.75 GiB / 251.75 GiB (27%)
Swap: 75.50 MiB / 40.00 GiB (0%)
What are you asking exactly?
What do you want to run? I assume you have a 24GB GPU and 64GB host RAM?
I regularly run llama3 70b unquantized on two P40s and CPU at around 7 tokens/s. It's usable but not very fast.
True, multiple drives speed up reads significantly. As long as the videos are read sequentially, though, read speeds can be very fast (600MB/s) even on one drive. Results may vary.
I have a ~40TB HDD array and jellyfin is super fast. Just put the database and cache files on an SSD.
For bulk storage of 4k videos with high bitrates, HDDs are way cheaper.
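For a rough sanity check of whether sequential read speed is the bottleneck, you can compare it against the video bitrate. This is just a sketch: the function name and the 80 Mbit/s example bitrate are assumptions, and it gives an upper bound, since several concurrent streams turn sequential reads into seeks and real throughput drops.

```python
def max_streams(read_mb_per_s: float, bitrate_mbit_per_s: float) -> int:
    """Upper bound on simultaneous streams a given read speed can feed.

    Hypothetical helper: ignores seek overhead, filesystem overhead and
    the network, so the real number will be lower.
    """
    stream_mb_per_s = bitrate_mbit_per_s / 8   # convert Mbit/s to MB/s
    return int(read_mb_per_s // stream_mb_per_s)


if __name__ == "__main__":
    # Assumed example: 600MB/s sequential reads, 80 Mbit/s 4k video.
    print(max_streams(read_mb_per_s=600, bitrate_mbit_per_s=80))
```

Even with a much slower drive than that, a handful of high-bitrate streams is far below what sequential reads can deliver.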
Which OS are you running?
Try to partition it with free space at the end and see if it makes a difference.
Try to trim the drive and see if it speeds up again.
Do you use any disk encryption?
Llama3.1 33b would be so cool. It would be a nice middle ground for my machine.
At least on Linux, rm is very fast.
I use tubearchivist. It has a jellyfin addon but it could really use some improvements on how it exposes the videos.
I spam escape, but I usually disable sleep on all my machines and use hibernation instead. Too many issues with sleep: it randomly wakes up, USB devices aren't recognized, a monitor stays black...