this post was submitted on 26 Jul 2024
311 points (98.1% liked)

Technology

[–] MeatStiq 63 points 4 months ago (9 children)
[–] [email protected] 16 points 4 months ago* (last edited 4 months ago)

Damn I just upgraded to socket AM5 too, great decision. It almost was Intel.

[–] Alchalide 2 points 4 months ago

Same and also went with Radeon graphics this time. Very happy with my decision.

[–] ShittyBeatlesFCPres 38 points 4 months ago (2 children)

And people laughed at me for sticking with my MOS 6502. Who’s laughing now?

[–] [email protected] 17 points 4 months ago

Did you write a TCP/IP stack and web browser in BASIC?

[–] [email protected] 13 points 4 months ago (1 children)

"Pure, passive cooling. No fans or moving parts. Will be working a century from now."

[–] [email protected] 18 points 4 months ago

Will still be working on the same problem a century from now.

[–] [email protected] 29 points 4 months ago (4 children)

I recently built a 12th Gen PC, expecting that an upgrade to 13th Gen would soon be a cheap and significant upgrade path. Now there isn't going to be any way to know if a second-hand CPU is damaged in this way.

[–] [email protected] 24 points 4 months ago* (last edited 4 months ago) (2 children)

It can get a whole lot worse.

I bought a $500 13th gen CPU that destroyed itself, replaced it (and didn't keep the dead CPU) with a $500 14th gen CPU that destroyed itself, and spent another ~$500 on related hardware and dumping Intel stuff to go AMD to get a working system. I also spent a lot of time trying to resolve the problem. I'd bet that I'm not the person burned worst, because someone could very easily have replaced their motherboard or memory or power supply unit in the hopes of fixing the issue, as any of these could have looked like potential causes, and there'd be no way for anyone to prove to Intel that this was the cause even if Intel intended to reimburse for these.

At most, I might get $500 back if Intel reimburses for the 14th gen CPU; based on what they've been doing so far, I'd assume that at best they'd send out another Intel CPU (which I no longer have a use for, having gone AMD).

And I was mostly using this system for fun. While I was corrupting my root filesystem regularly at boot at the end, I ultimately didn't -- as far as I know -- suffer any serious data loss or expense from the data that the processor was corrupting. My system was mostly to be used for my own entertainment. I didn't miss deadlines or lose critical information.

As Steve Burke has pointed out in earlier episodes on this, there are people who have been impacted by those secondary costs, some of which might make my own costs look irrelevant.

He was talking to video game companies who were using affected processors as well as having customers who were affected; they had apparently banned some customers for cheating because they knew that the internal state of the game was incorrect; they couldn't figure out what the customers were doing, but knew that their game state was being modified. It apparently wasn't the customers cheating, but their CPU, which had partially destroyed itself, and was now corrupting memory.

Another had been using CPUs for video game servers and those kept dying and taking down service; another company estimated that they'd lost $100k in player business due to the problem.

Apparently these were also popular, due to high single-threaded performance, with hedge funds that do stock trading. I imagine that a system that suddenly stops working or corrupts data can very quickly become extremely expensive in that context, far in excess of what the CPUs cost.

OEMs who built and sold systems containing these CPUs had apparently been taking back systems and repeatedly replacing parts; they probably incurred substantial costs and hits to their own reputation, as customers are upset with them.

Same thing with datacenter providers, who incurred a lot of costs investigating and mitigating problems, swapping parts and CPUs. One of these Burke quoted as having advised customers to use an alternate AMD-based system and if they insisted on the Intel one, the provider would charge a $1000 additional service fee to cover all the costs the provider was taking in having to deal with systems based on the CPUs. Gives an idea of what they were losing.

God only knows what the impact of having a ton of data around the world corrupted is. Probably no more than a tiny fraction of the problems related to corruption will ever actually be attributed to the CPUs themselves.

And I don't know how many systems out there may not be fully-tracked -- so they don't get updates to avoid the problem -- and have the CPUs built into them. Industrial automation hardware? Ship navigation systems? Who knows? All kinds of things that might fail in absolutely spectacular ways if they work for a period of time, then down the road, eventually start corrupting data more and more severely.

I mean, Intel might, at best, provide a cash refund for a dead CPU. But they aren't gonna cover losses from secondary problems, and there's no realistic way that most businesses and people who bought these could prove them, anyway.

Buying the last CPU they made before this clusterfuck occurred is maybe one of the best things you could have done and still be indirectly affected, as you got a reasonably fast system that wasn't directly affected -- if I'd known about this in advance, rather than Intel not saying anything, I'd have happily purchased a 12th gen CPU rather than spending another $1k on useless hardware and a ton of time trying to resolve my problems. You'll have the option, at upgrade time, to go AMD or 15th gen Intel and LGA 1851, if you want to hope that Intel's 15th gen is more solid than their previous two. Just means a new motherboard and, if you're using DDR4 memory, you'll need to toss that and buy DDR5.

[–] systemglitch 10 points 4 months ago (1 children)

Anyone who knows of this and buys 15th gen over AMD is a fool imo. The risk is just so high and AMD has become so solid in the last decade.

[–] [email protected] 3 points 4 months ago

Sure would be nice if you could disable the god damn PSP though. Sigh. Companies just CAN NOT not spy on you. It's insane.

[–] [email protected] 2 points 4 months ago (1 children)

I would have gone AMD in the first place if this happened at the time of my purchase.

Oh well. Upgrade time is going to be a long way away. My last gaming PC served me well for almost 10 years before I did an in socket upgrade.

[–] [email protected] 3 points 4 months ago

"I would have gone AMD in the first place if this happened at the time of my purchase."

Well, you've got better judgement than me. I'd been running just Intel for ~25 years and was comfortable with them, and even when ordering the replacement, still wasn't absolutely certain that the CPU was at fault until the replacement (temporarily, for a few months) resolved all the problems.

Moving forward, I expect I'll use AMD unless they manage to do something like this.

"My last gaming PC served me well for almost 10 years before I did an in socket upgrade."

Yeah, not a lot of annual single-threaded performance improvements since the early 2000s. Can very easily use older CPUs just fine for a long time these days, depending upon workload.

[–] [email protected] 20 points 4 months ago (1 children)

Ebay is probably going to be flooded with damaged Intel CPUs.

[–] [email protected] 6 points 4 months ago

Not to mention PCs with them.

[–] astanix 11 points 4 months ago (1 children)

Used to be that a CPU was the safest used pc part to buy... not anymore :/

[–] [email protected] 3 points 4 months ago

That's why I went this route 🫤

[–] [email protected] 17 points 4 months ago* (last edited 4 months ago) (3 children)

If your CPU is crashing/unstable then yes, the damage is already done, but for the few of us who bought these later: just update your BIOS to the latest one, set Intel defaults, do not overclock (I have even undervolted it a bit, but ymmv) and wait for the microcode update.

Though I do wonder if Intel isn't just stalling for time, I do hope they are not. Didn't wanna touch my build for next ~5 years.

[–] [email protected] 23 points 4 months ago* (last edited 4 months ago)

That is, disappointingly, not sufficient to guarantee avoiding damage. I set all that in the BIOS using my first processor (13900KF) before ever inserting my replacement processor (14900KF) into the motherboard. The replacement processor still destroyed itself.

Processor 1 used only motherboard defaults and managed to destroy itself.

Processor 2 used only Intel recommended settings, no XMP memory profile, no Intel turbo boost, more conservative than motherboard defaults, and also destroyed itself.

I did not try running a processor for its lifetime at minimum memory speed or with only 1 core active. It's possible that that might be sufficient to avoid damage. If I hadn't already gone AMD over this, and had to use a processor from the affected generations, that's what I'd be doing now until Intel comes out with their update. Not gonna do much by way of fancy gaming, but at least the system's usable and hopefully won't destroy itself.

[–] systemglitch 8 points 4 months ago

Rather get Ryzen and not deal with this.

[–] [email protected] 4 points 4 months ago

There is also a corrosion issue. No software update will fix that. Intel purposely misled the media on that.

[–] [email protected] 15 points 4 months ago

Real shit, can you sue them for this? I mean they aren't stopping selling them even knowing they are faulty. Seriously, how can you get your money back from these vultures.

[–] [email protected] 11 points 4 months ago (4 children)

Any solutions for avoiding the damage if you happen to get a new one?

[–] kvasir476 10 points 4 months ago (1 children)

What, if anything, can customers do to slow or stop degradation ahead of the microcode update?

Intel recommends that users adhere to Intel Default Settings on their desktop processors, along with ensuring their BIOS is up to date. Once the microcode patch is released to Intel partners, we advise users check for the relevant BIOS updates.

[–] [email protected] 6 points 4 months ago* (last edited 4 months ago) (1 children)

I destroyed my second CPU, a 14900KF, while having already been aware of that recommendation, and having disabled all of the settings like that that the motherboard vendor had enabled by default prior to ever inserting the replacement CPU, and only used the CPU with those settings; it still destroyed itself, like the first. I am very confident that you can still destroy a CPU having done that.

That isn't to say that using conservative settings is a bad idea (and maybe doing something further, like running memory at minimum frequency, not just using the Intel recommended default rather than the motherboard vendor defaults, might actually manage to reliably avoid CPU damage). But I am confident that just running standard Intel recommended settings is not, alone, enough to avoid damage.

[–] kvasir476 2 points 4 months ago

Completely agree, that was just a quote ripped straight from the article. From everything I've heard it seems like people are having problems just running stock settings. Your best bet to absolutely avoid any damage is probably to literally shut your system down until the patches are available.

[–] [email protected] 7 points 4 months ago* (last edited 4 months ago)

There's no 100% way until the new microcode is released next month. All affected CPUs are at risk of silicon degradation by the excessive voltage.

There are some power limits and July BIOS updates you can use that Intel says can help reduce the damage or prevent it entirely in some scenarios. I believe the damage is specifically caused by single threaded spikes, so reducing LLC and running something like prime95 in the background might hold the voltage low enough that it won't happen. But there is no fix yet, so if your CPU is susceptible, running it will degrade the CPU, at least until the fix is out.

[–] [email protected] 2 points 4 months ago* (last edited 4 months ago)

If you can avoid using a new one, I would. I would not buy or use an unused 13th gen or 14th gen Intel CPU until Intel completes their updates.

In my case, there was a period of time where I had an old, damaged 13th gen CPU, and a new, unused 14th gen.

I was always able to use my damaged CPUs without problems as long as I booted up Linux and told it to use only one core (maxcpus=1 on the GRUB command line passed to the kernel). Even two cores enabled, and it couldn't even boot towards the end, but I never saw corruption with one.
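
For anyone wanting to confirm that the setting actually took effect, here's a rough Python sketch that pulls the maxcpus= value out of a kernel command line string. On Linux the running kernel's command line can be read from /proc/cmdline; the function name maxcpus_from_cmdline is made up for illustration, and keeping the last value when the parameter is repeated is just a choice made in this sketch.

```python
def maxcpus_from_cmdline(cmdline: str):
    """Return the integer maxcpus= value from a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("maxcpus="):
            try:
                # If the parameter appears more than once, keep the last one.
                value = int(token.split("=", 1)[1])
            except ValueError:
                # Not a number, so treat it as "not set" in this sketch.
                value = None
    return value
```

Feeding it the contents of /proc/cmdline after a reboot should give 1 if the parameter was picked up, and None if it never made it onto the command line.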

If I could rewind time, I would continue to use my old CPU and avoid using the new one. I would add maxcpus=1 to my Linux command line (to do it every boot, edit /etc/default/grub, then run sudo update-grub on Debian-family systems). And I'd use the damaged CPU on a single core until I knew that Intel had a workaround in microcode and my motherboard had the relevant BIOS update applied, and then I'd swap in the replacement CPU.
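
The /etc/default/grub edit can be sketched in Python as a plain text transformation. This is only an illustration under some assumptions: the file uses the usual double-quoted GRUB_CMDLINE_LINUX_DEFAULT="..." line, the helper name add_kernel_param is hypothetical, and the function only returns the new text. Writing the file back and running sudo update-grub are still up to you.

```python
import re


def add_kernel_param(grub_text: str, param: str = "maxcpus=1") -> str:
    """Return grub_text with param added to GRUB_CMDLINE_LINUX_DEFAULT."""
    # Assumes the value is wrapped in double quotes, as on stock Debian-family installs.
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def patch(match):
        params = match.group(2).split()
        if param not in params:  # do not add the parameter twice
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = pattern.subn(patch, grub_text)
    if count == 0:
        # Variable not found: append a fresh line instead.
        new_text = grub_text.rstrip("\n") + f'\nGRUB_CMDLINE_LINUX_DEFAULT="{param}"\n'
    return new_text
```

Running it twice on the same text leaves the result unchanged, so it is safe to re-apply. Keep a backup of the original file before overwriting it.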

If I didn't have a known-damaged CPU, just have a still-working 13th or 14th gen processor and could get by using an old desktop or laptop or something until the update is out, I'd probably do that if at all possible, so that I don't incur damage.

[–] [email protected] 10 points 4 months ago* (last edited 4 months ago)

Vote with your wallet and don't ever get anything from this piece of shit trash ass company again. What a joke. They aren't even stopping selling them KNOWING there's an issue. Wish I had money to sue the fuck out of them.

[–] BeatTakeshi 8 points 4 months ago (2 children)

What did they mess up from 12th?

[–] Cort 14 points 4 months ago (2 children)

13th & 14th Gen were just higher voltage, clock speed, and boost time limit versions of 12th Gen. It seems like they just overdid it.

[–] CatZoomies 7 points 4 months ago

Holy crap I barely escaped. I needed an upgrade years ago and settled on the i7-12700k. After I ride this chip out I’m switching to AMD.

I really hope customers get justice in this debacle. We need a lawsuit now.

[–] [email protected] 4 points 4 months ago

Money grab because they didn't have anything new to actually bring to the table this time.

[–] PM_Your_Nudes_Please 6 points 4 months ago* (last edited 4 months ago) (1 children)

13th and 14th gen are literally the exact same hardware as 12th gen, but with boosted clock speeds and power requirements. Basically, Intel is struggling to develop new hardware, as they’re beginning to be limited by things like atom size and the speed of light across the width of the chip. So instead of developing new hardware, they just slapped new code onto the 12th gen chips and called them a new generation.

But they made the rookie mistake of not adequately dealing with heat dissipation (an easy one to make when overclocking), and chips are burning out.

[–] [email protected] 5 points 4 months ago* (last edited 4 months ago) (1 children)

I don't think that the voltage issue is simply heat, not unless it is some kind of extremely-localized or extremely-short-in-time issue internal to the chip. I hit the problem with a very hefty water cooler that didn't let the attached processor ever get very warm, at least as the processor reported temperatures.

Wendell, at Level1Techs, who did an earlier video with Steve Burke talking about this, looked over a dataset of hundreds of machines. They were running with conservative speed settings, in a datacenter where all temperatures were being logged, and he said that the hottest he ever saw on any hotspot on any processor in his dataset was, IIRC, 85 degrees Celsius, and normally they were well below that. He saw about a 50% failure rate.

If we hit the problem on our well-cooled CPUs, if the CPU simply getting hot were a problem, I'd have expected people running them in hotter environments to have slammed into the thing immediately. Ditto for Intel -- I'd guess (I'd hope) that part of their QA cycle involves running the processors in an industrial oven, as a way to simulate more-serious conditions. Those things are supposed to be fine at 100 degrees Celsius, at which point they throttle themselves.

[–] [email protected] 7 points 4 months ago

So glad I spent like $2K on a computer with one of these in it that has custom firmware and BIOS on it. Guess I'm just fucked eh? Never buying Intel ever again.

[–] [email protected] 5 points 4 months ago

It really sucks that the only chips that support openbios and custom firmware are from such a shit company like this.

WHY will AMD and Nvidia not support it? I'm running out of options. Guess I'll just stick with old ass computers from now on.

FUCK Intel.

[–] Sanctus 3 points 4 months ago (4 children)

I have one in the box from Christmas. Kinda scared to use it.

[–] [email protected] 8 points 4 months ago* (last edited 4 months ago) (2 children)

If I had a known unused one, I would absolutely not use it until Intel finishes putting out their patch to motherboards to address this. You have no idea whether you could cause damage that won't be detected, leaving you with a slightly damaged processor that malfunctions occasionally.

Intel may publish guidance on how to use unpatched processors. If they don't -- they sure have not been forthcoming with information thus far -- here's my own suggestion.

When I do use it, I would, prior to booting any OS on the CPU, go into the BIOS and turn everything related to the CPU to minimal performance. Memory speed down, disable Intel turbo boost, everything. If you can disable cores there, disable all but one -- even my severely-damaged pair of CPUs could still boot without corrupting my root filesystem as long as I ran using only a single core (though two cores induced problems), and I'd take that as an argument in favor of one core being preferable, though I cannot say for sure that doing so helps avoid damaging the chip rather than just avoiding being affected by the damage once incurred.

And the first thing I'd do, booted into that minimal-performance-CPU-environment, would be to do that motherboard BIOS update. Then go back and reset the motherboard to defaults and use the thing normally.

Maybe that's over-cautious, but we know that the processors destroy themselves with use, and we have no idea what the minimum amount of time -- if any -- to incur damage is. Unless Intel can come out with some kind of diagnostic to reliably detect damaged CPUs, you won't know if you damaged your CPU in that window before the BIOS update, and it is maybe occasionally corrupting data, which I'd guess is a situation that you probably don't want to be in during the lifetime of the CPU.

[–] [email protected] 3 points 4 months ago (1 children)

Some motherboards can update the BIOS without a CPU installed. Look for a BIOS flash button on the motherboard.

[–] [email protected] 2 points 4 months ago

If viable for someone's particular situation, that sounds like an even better suggestion than what I offered.

[–] [email protected] 2 points 4 months ago (2 children)

I thought I read that Intel said this was from messing with voltages? I have had plenty of these processors in the last couple of years and never experienced crashes, but I don’t overclock

[–] [email protected] 4 points 4 months ago

Nah that was Intel trying to make excuses so it seemed like a competent company.

[–] [email protected] 4 points 4 months ago

That was one initial theory, but it's known to not be the cause. An earlier video that Steve Burke and Wendell from Level1Techs did had Wendell examine several hundred CPUs that were running in servers on non-Z790 motherboards (another source of potential problems that was initially blamed) at conservative settings, with temperatures known and logged for the lifetime of the server (so not temperature). He still saw about a 50% failure rate.

I also personally destroyed one of my CPUs with motherboard default settings, and the other with Intel's recommended settings (less aggressive than the motherboard defaults), so I can personally attest to this not just being people running with crazy voltages or something.

There may also be other issues that people have caused by doing something else, but the elephant in the room has been narrowed down to processors destroying themselves while running well within spec.
