this post was submitted on 07 Dec 2024

Hello all,

I've posted about this issue before, but I was never able to resolve it. I gave up for a while, but now I'm back and decided to try Mint because of its reputation for being easy to use.

Basically, across Ubuntu, Arch, Manjaro, Debian, Fedora, and now Mint, I have only been able to get the proprietary NVIDIA drivers to work for a few hours, after which I start to experience freezes and crashes more and more frequently (often before or right after the login screen). Bazzite worked for the longest period of time, but once the crashes and freezes started, they didn't stop no matter what I did, even after a fresh install from a brand new USB drive.

What typically happens is everything but the mouse will either get super choppy then freeze or just outright freeze, often with a bunch of artifacting that looks like either X's or space invaders all over either a window or the whole screen. It always requires a hard restart and the longer I wait before turning it back on, the longer I tend to have before it does it again.

Switching back to nouveau (when I was able to) worked on Debian, Fedora, and Mint, but with PoE2 in early access and Shadow of the Erdtree on my backlog, I really would like to be able to play video games, so it's not immensely helpful.

So far with Mint I have tried all three drivers in the manager (550, 535, and 470). 470 didn't seem to work, and 535 and 550 both worked for a few minutes before starting to freeze and crash. I also tried the 550 open drivers, which gave me the phantom monitor issue and refused to pick up my second monitor at all. I blacklisted nouveau, which made the phantom monitor issue persist with the normal 550 driver too, and it didn't fix the main issue anyway. I am now back at square one with nouveau again.

System:
  Kernel: 6.8.0-49-generic arch: x86_64 bits: 64 compiler: gcc v: 13.2.0 clocksource: tsc
  Desktop: Cinnamon v: 6.2.9 tk: GTK v: 3.24.41 wm: Muffin v: 6.2.0 vt: 7 dm: LightDM v: 1.30.0
    Distro: Linux Mint 22 Wilma base: Ubuntu 24.04 noble
Machine:
  Type: Desktop Mobo: Micro-Star model: X570-A PRO (MS-7C37) v: 3.0 serial: <superuser required>
    uuid: <superuser required> UEFI: American Megatrends v: H.40 date: 09/10/2019
CPU:
  Info: 12-core model: AMD Ryzen 9 3900X bits: 64 type: MT MCP smt: enabled arch: Zen 2 rev: 0
    cache: L1: 768 KiB L2: 6 MiB L3: 64 MiB
  Speed (MHz): avg: 3782 high: 4300 min/max: N/A cores: 1: 3659 2: 3800 3: 3847 4: 3599 5: 3602
    6: 3600 7: 4280 8: 4260 9: 3426 10: 3427 11: 3600 12: 3800 13: 3800 14: 3600 15: 3876 16: 3598
    17: 3800 18: 3800 19: 4300 20: 4300 21: 3542 22: 3800 23: 4049 24: 3412 bogomips: 182404
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3
Graphics:
  Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] vendor: Micro-Star MSI driver: nouveau v: kernel
    arch: Pascal pcie: speed: 2.5 GT/s lanes: 16 ports: active: HDMI-A-1,HDMI-A-2
    empty: DP-1,DP-2,DVI-D-1 bus-ID: 2d:00.0 chip-ID: 10de:1b06 class-ID: 0300 temp: 40.0 C
  Display: x11 server: X.Org v: 21.1.11 with: Xwayland v: 23.2.6 driver: X: loaded: modesetting
    unloaded: fbdev,vesa dri: nouveau gpu: nouveau display-ID: :0 screens: 1
  Screen-1: 0 s-res: 5280x2560 s-dpi: 96 s-size: 1397x677mm (55.00x26.65")
    s-diag: 1552mm (61.12")
  Monitor-1: HDMI-A-1 mapped: HDMI-1 pos: primary,bottom-r model: BenQ PD3200U serial: <filter>
    res: 3840x2160 hz: 60 dpi: 138 size: 708x399mm (27.87x15.71") diag: 806mm (31.7") modes:
    max: 3840x2160 min: 720x400
  Monitor-2: HDMI-A-2 mapped: HDMI-2 pos: top-left model: BenQ PD2500Q serial: <filter>
    res: 1440x2560 hz: 60 dpi: 118 size: 311x553mm (12.24x21.77") diag: 634mm (25") modes:
    max: 2560x1440 min: 720x400
  API: EGL v: 1.5 hw: drv: nvidia nouveau platforms: device: 0 drv: nouveau device: 1 drv: swrast
    gbm: drv: nouveau surfaceless: drv: nouveau x11: drv: nouveau inactive: wayland
  API: OpenGL v: 4.5 compat-v: 4.3 vendor: mesa v: 24.0.9-0ubuntu0.2 glx-v: 1.4
    direct-render: yes renderer: NV132 device-ID: 10de:1b06
Audio:
  Device-1: NVIDIA GP102 HDMI Audio vendor: Micro-Star MSI driver: snd_hda_intel v: kernel pcie:
    speed: 2.5 GT/s lanes: 16 bus-ID: 2d:00.1 chip-ID: 10de:10ef class-ID: 0403
  Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI X570-A PRO driver: snd_hda_intel
    v: kernel pcie: speed: 16 GT/s lanes: 16 bus-ID: 2f:00.4 chip-ID: 1022:1487 class-ID: 0403
  API: ALSA v: k6.8.0-49-generic status: kernel-api
  Server-1: PipeWire v: 1.0.5 status: active with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin
Network:
  Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet vendor: Micro-Star MSI
    X570-A PRO driver: r8169 v: kernel pcie: speed: 2.5 GT/s lanes: 1 port: d000 bus-ID: 27:00.0
    chip-ID: 10ec:8168 class-ID: 0200
  IF: enp39s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel pcie: speed: 5 GT/s lanes: 1
    bus-ID: 29:00.0 chip-ID: 8086:2723 class-ID: 0280
  IF: wlp41s0 state: down mac: <filter>
Bluetooth:
  Device-1: Intel AX200 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0 speed: 12 Mb/s lanes: 1
    bus-ID: 3-6.3:5 chip-ID: 8087:0029 class-ID: e001
  Report: hciconfig ID: hci0 rfk-id: 0 state: up address: <filter> bt-v: 5.2 lmp-v: 11
    sub-v: 2184 hci-v: 11 rev: 2184 class-ID: 7c0104
Drives:
  Local Storage: total: 4.32 TiB used: 240.68 GiB (5.4%)
  ID-1: /dev/sda vendor: Samsung model: SSD 860 EVO 250GB size: 232.89 GiB speed: 6.0 Gb/s
    tech: SSD serial: <filter> fw-rev: 1B6Q scheme: GPT
  ID-2: /dev/sdb vendor: Toshiba model: HDWE140 size: 3.64 TiB speed: 6.0 Gb/s tech: HDD
    rpm: 7200 serial: <filter> fw-rev: FP2A scheme: GPT
  ID-3: /dev/sdc vendor: Western Digital model: WDS500G2B0B-00YS70 size: 465.76 GiB
    speed: 6.0 Gb/s tech: SSD serial: <filter> fw-rev: 90WD scheme: GPT
Partition:
  ID-1: / size: 227.68 GiB used: 54.89 GiB (24.1%) fs: ext4 dev: /dev/sda2
  ID-2: /boot/efi size: 511 MiB used: 6.1 MiB (1.2%) fs: vfat dev: /dev/sda1
Swap:
  ID-1: swap-1 type: file size: 2 GiB used: 0 KiB (0.0%) priority: -2 file: /swapfile
USB:
  Hub-1: 1-0:1 info: hi-speed hub with single TT ports: 6 rev: 2.0 speed: 480 Mb/s lanes: 1
    chip-ID: 1d6b:0002 class-ID: 0900
  Hub-2: 2-0:1 info: super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s lanes: 1 chip-ID: 1d6b:0003
    class-ID: 0900
  Hub-3: 3-0:1 info: hi-speed hub with single TT ports: 6 rev: 2.0 speed: 480 Mb/s lanes: 1
    chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 3-5:2 info: Micro Star MYSTIC LIGHT type: HID driver: hid-generic,usbhid
    interfaces: 1 rev: 1.1 speed: 12 Mb/s lanes: 1 power: 500mA chip-ID: 1462:7c37 class-ID: 0300
    serial: <filter>
  Hub-4: 3-6:3 info: Genesys Logic Hub ports: 4 rev: 2.0 speed: 480 Mb/s lanes: 1 power: 100mA
    chip-ID: 05e3:0608 class-ID: 0900
  Device-1: 3-6.3:5 info: Intel AX200 Bluetooth type: bluetooth driver: btusb interfaces: 2
    rev: 2.0 speed: 12 Mb/s lanes: 1 power: 100mA chip-ID: 8087:0029 class-ID: e001
  Hub-5: 4-0:1 info: super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s lanes: 1 chip-ID: 1d6b:0003
    class-ID: 0900
  Hub-6: 5-0:1 info: hi-speed hub with single TT ports: 4 rev: 2.0 speed: 480 Mb/s lanes: 1
    chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 5-3:2 info: PloopyCo Mouse type: keyboard,HID driver: hid-generic,usbhid
    interfaces: 2 rev: 1.1 speed: 12 Mb/s lanes: 1 power: 100mA chip-ID: 5043:4d6f class-ID: 0300
  Device-2: 5-4:3 info: Glorious GMMK Pro type: keyboard,HID driver: hid-generic,usbhid
    interfaces: 2 rev: 1.1 speed: 12 Mb/s lanes: 1 power: 500mA chip-ID: 320f:5044 class-ID: 0300
  Hub-7: 6-0:1 info: super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s lanes: 1 chip-ID: 1d6b:0003
    class-ID: 0900
Sensors:
  System Temperatures: cpu: 56.5 C mobo: N/A gpu: nouveau temp: 40.0 C
  Fan Speeds (rpm): N/A gpu: nouveau fan: 0
Repos:
  Packages: 2586 pm: dpkg pkgs: 2568 pm: flatpak pkgs: 18
  No active apt repos in: /etc/apt/sources.list
  Active apt repos in: /etc/apt/sources.list.d/official-package-repositories.list
    1: deb http: //packages.linuxmint.com wilma main upstream import backport
    2: deb http: //archive.ubuntu.com/ubuntu noble main restricted universe multiverse
    3: deb http: //archive.ubuntu.com/ubuntu noble-updates main restricted universe multiverse
    4: deb http: //archive.ubuntu.com/ubuntu noble-backports main restricted universe multiverse
    5: deb http: //security.ubuntu.com/ubuntu/ noble-security main restricted universe multiverse
  Active apt repos in: /etc/apt/sources.list.d/spotify.list
    1: deb http: //repository.spotify.com stable non-free
Info:
  Memory: total: 32 GiB available: 31.28 GiB used: 4.28 GiB (13.7%)
  Processes: 602 Power: uptime: 3h 4m states: freeze,mem,disk suspend: deep wakeups: 0
    hibernate: platform Init: systemd v: 255 target: graphical (5) default: graphical
  Compilers: gcc: 13.2.0 Client: Unknown python3.12 client inxi: 3.3.34
top 14 comments
RedWeasel 5 points 3 weeks ago

If I am understanding that output, your BIOS is 5 years old and there have been a LOT of updates since. If that is correct, I'd try updating, as BIOS updates released after that date fix things like PCIe compatibility.

mushroomstormtrooper 2 points 3 weeks ago* (last edited 3 weeks ago)

I had some time and updated my BIOS, then switched over to the 550 driver from the driver manager. Upon restart I had the phantom monitor issue, which seems to have been fixed by following this post. It's only been a few minutes and I haven't done anything with it yet, so I will update if any more issues come up, but I'm feeling hopeful.

Edit: literally the second after posting this reply I got a freeze trying to open steam...

RedWeasel 2 points 3 weeks ago

I can't tell from that output if you are using Wayland or X11. If it is Wayland, I'd try X11, as the drivers before 555(?) don't work as well with Wayland and some compositors may be more unstable with the NVIDIA drivers.

I'd also check the logs. 'journalctl -b -1 -xe' shows the end of the log from the prior boot.

Those are my only other suggestions.
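For anyone following along, here is a rough sketch of how the previous boot's kernel log could be narrowed down to the GPU-related lines. The NVIDIA kernel driver reports GPU faults as "NVRM: Xid" messages, so the filter simply looks for those two markers. The function names are made up for this example, and reading the journal may need sudo or membership in a log-reading group depending on the distro.

```python
import re
import subprocess

# Markers the NVIDIA kernel module uses in kernel log messages.
GPU_MARKERS = re.compile(r"NVRM|Xid")


def gpu_lines(log_text):
    """Return only the lines of log_text that mention the NVIDIA kernel driver."""
    return [line for line in log_text.splitlines() if GPU_MARKERS.search(line)]


def previous_boot_kernel_log():
    """Fetch kernel messages from the previous boot, or '' if journalctl is unavailable."""
    try:
        result = subprocess.run(
            ["journalctl", "-k", "-b", "-1", "--no-pager"],
            capture_output=True,
            text=True,
            check=False,  # a missing previous boot just yields empty output
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in gpu_lines(previous_boot_kernel_log()):
        print(line)
```

If this prints nothing after a hard freeze, that is still useful information: it suggests the machine locked up before the driver could log anything.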

mushroomstormtrooper 1 points 3 weeks ago

I managed to check the output of that command and switch back to nouveau before it froze this time. The mouse is still operable but you can see the artifacting I've been getting. It looks like I am on x11 as my three options at start are Cinnamon (Default), Cinnamon (Software Rendering), and Cinnamon on Wayland (Experimental). I am about to try Wayland just to see.

RedWeasel 2 points 3 weeks ago

On top of the thermal paste idea, try doing a RAM test. You could also try adding "nvidia_drm.modeset=0" to your kernel command line, or adding "options nvidia-drm modeset=0" to /etc/modprobe.d/nvidia.conf. It might be set to 1 somewhere.

There may be more in the logs as well. Are you able to test in windows? Maybe the gpu is failing as well.
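To confirm which modeset value actually took effect after a reboot, something like the sketch below might help. It checks the booted kernel command line and, if the module is loaded, the live value under /sys/module. The helper names are invented for illustration, and the "last occurrence wins" behaviour is an assumption about how repeated parameters are resolved.

```python
from pathlib import Path


def modeset_from_cmdline(cmdline):
    """Return the nvidia_drm.modeset value on a kernel command line, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        # The kernel accepts '-' and '_' interchangeably in module names.
        if sep and key.replace("-", "_") == "nvidia_drm.modeset":
            value = val  # assumption: a later occurrence overrides an earlier one
    return value


def read_text_or_none(path):
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text_or_none("/proc/cmdline") or ""
    print("kernel command line:", modeset_from_cmdline(cmdline))
    # This file only exists while the nvidia_drm module is loaded.
    print("live module value:", read_text_or_none("/sys/module/nvidia_drm/parameters/modeset"))
```

A value of None on the command line together with a live value that is still enabled would suggest the setting is coming from somewhere else, such as another file in /etc/modprobe.d.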

mushroomstormtrooper 1 points 2 weeks ago* (last edited 2 weeks ago)

I reinstalled the 550 driver and added that line in nvidia.conf. I will restart and see how it goes. I suppose I could set up a temporary dual boot and test it in Windows. This all started as soon as I first started trying to get Linux working on this PC, though. Prior to that it had been running Windows since I got it. The temperature readings all seem fine, too. That being said, I think the next thing I try will be replacing the thermal paste, and if that doesn't work I'll dig up my Windows installation media and see how that goes. Someone in a previous thread suggested my PSU might be going out, which would make sense because it's the only original part other than the CD drive at this point, so I think that might be my next try if nothing free works.

Edit: I got a black screen on the first restart. On the second and third restarts I got to log in and then went to a black screen with a working cursor. After that it seemed to be working fine, until almost everything went black except parts of Spotify and KeePass, and white squares flashed everywhere for about 30 seconds before stopping. Everything is working as normal now for the time being.

Edit 2: Just played Elden Ring for a few minutes. The game crashed pretty quickly, but not immediately, and nothing else crashed. Still planning to do the memtest at the next full crash.

Edit 3: Had a hard freeze and ran the memtest when I had to restart. It passed.

mushroomstormtrooper 1 points 3 weeks ago

I'll have some time in the morning to mess with it and I will post those results.

mushroomstormtrooper 2 points 3 weeks ago

I'll give that a shot this weekend and see if that ends up fixing the issue! Thanks!

sylver_dragon 3 points 3 weeks ago

I noticed you have an MSI motherboard. Do you have XMP enabled in the BIOS? Sometime after I switched to Arch, I did a driver update (NVIDIA 3080) and started getting random freezes. The whole system would lock up and I would have to hold the power button down to force a power off. I went crazy trying to find any sort of fix and decided to turn off XMP out of desperation. I haven't had a freeze since. Maybe it's not related to your problem, but it seems that something about the newer NVidia drivers just didn't like what MSI was doing.

mushroomstormtrooper 1 points 3 weeks ago

I just went in and checked and it looks like it is already off. Out of curiosity, I tried turning on profile one (froze at login, which is what happened after the most recent reboot) and two (froze on black screen before login).

sylver_dragon 3 points 3 weeks ago

Had another thought late last night. Have you looked at the thermal paste on your GPU? I'm guessing that 1080 you have has a few miles on it and is also MSI. They went through a period where they were using some pretty terrible thermal paste on their cards, which tended to dry up and stop working well after a few years. My wife had an RX980 which this happened to and the main symptom was random freezes under load. Perhaps the new driver is running it just hot enough at idle to cause problems.

mushroomstormtrooper 1 points 3 weeks ago

I haven't checked the thermal paste, but I think I have some lying around. I have periodically been checking the temperature at 2-second intervals in a terminal off to the side, and it seems to freeze up regardless of what temperature it was at. I will open it up and replace the paste because it couldn't hurt. I will update if that helps.
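Watching a terminal loses the last reading when the screen freezes, so one option is to log the readings to a file instead. The sketch below assumes the proprietary driver is loaded (nvidia-smi does not work under nouveau) and polls temperature and power draw every 2 seconds, syncing each line to disk so it survives a hard reset. The function names and the gpu-log.csv filename are placeholders.

```python
import os
import subprocess
import time
from datetime import datetime

# Ask nvidia-smi for bare CSV values: temperature in C, power draw in W.
QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu,power.draw",
    "--format=csv,noheader,nounits",
]


def parse_reading(line):
    """Turn one 'temperature, power' CSV line into (int, float or None)."""
    temp_text, power_text = (field.strip() for field in line.split(","))
    try:
        power = float(power_text)
    except ValueError:
        power = None  # e.g. "[N/A]" on cards that do not report power draw
    return int(temp_text), power


def log_forever(path="gpu-log.csv", interval=2.0):
    """Append timestamped readings to path until interrupted (Ctrl+C)."""
    with open(path, "a") as log:
        while True:
            out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
            temp, power = parse_reading(out.splitlines()[0])
            stamp = datetime.now().isoformat(timespec="seconds")
            log.write(f"{stamp},{temp},{power}\n")
            log.flush()
            os.fsync(log.fileno())  # force the line to disk before the next freeze
            time.sleep(interval)
```

Calling log_forever() in a spare terminal before launching a game, then reading the tail of the file after the next hard restart, would show the temperature and power draw in the seconds before the freeze. A freeze at low temperature but during a power spike would point more toward the PSU than the thermal paste.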

[email protected] 2 points 2 weeks ago

I had a similar symptom recently but with an AMD system. I realized pretty quickly that my system simply forgot I had ram installed. I took out the ram sticks and swapped them, everything was fine afterwards.

Maybe try running a memtest just to be sure.

mushroomstormtrooper 1 points 2 weeks ago* (last edited 2 weeks ago)

I made a memtest drive and will run it as soon as I need to restart again, which might be soon because I'm starting to see more artifacting again.

Edit: Had a hard freeze and had to restart. The memtest passed with no errors.