US chip-maker Nvidia led a rout in tech stocks Monday after the emergence of a low-cost Chinese generative AI model that could threaten US dominance in the fast-growing industry.
The chatbot developed by DeepSeek, a startup based in the eastern Chinese city of Hangzhou, has apparently shown the ability to match the capacity of US AI pace-setters for a fraction of the investments made by American companies.
Shares in Nvidia, whose semiconductors power the AI industry, fell more than 15 percent in midday deals on Wall Street, erasing more than $500 billion of its market value.
The tech-rich Nasdaq index fell more than three percent.
AI players Microsoft and Google parent Alphabet were firmly in the red while Meta bucked the trend to trade in the green.
DeepSeek, whose chatbot became the top-rated free application on Apple's US App Store, said it spent only $5.6 million developing its model -- peanuts when compared with the billions US tech giants have poured into AI.
US "tech dominance is being challenged by China," said Kathleen Brooks, research director at trading platform XTB.
"The focus is now on whether China can do it better, quicker and more cost effectively than the US, and if they could win the AI race," she said.
US venture capitalist Marc Andreessen has described DeepSeek's emergence as a "Sputnik moment" -- when the Soviet Union shocked Washington with its 1957 launch of a satellite into orbit.
As DeepSeek rattled markets, the startup on Monday said it was limiting the registration of new users due to "large-scale malicious attacks" on its services.
Meta and Microsoft are among the tech giants scheduled to report earnings later this week, offering an opportunity to comment on the emergence of the Chinese company.
Shares in another US chip-maker, Broadcom, fell 16 percent while Dutch firm ASML, which makes the machines used to build semiconductors, saw its stock tumble 6.7 percent.
"Investors have been forced to reconsider the outlook for capital expenditure and valuations given the threat of discount Chinese AI models," David Morrison, senior market analyst at Trade Nation.
"These appear to be as good, if not better, than US versions."
Wall Street's broad-based S&P 500 index shed 1.7 percent while the Dow was flat at midday.
In Europe, the Frankfurt and Paris stock exchanges closed in the red while London finished flat.
Asian stock markets mostly slid.
Just last week, following his inauguration, US President Donald Trump announced a $500 billion venture to build AI infrastructure in the United States, led by Japanese giant SoftBank and ChatGPT-maker OpenAI.
SoftBank tumbled more than eight percent in Tokyo on Monday while Japanese semiconductor firm Advantest was also down more than eight percent and Tokyo Electron off almost five percent.
Reposting from a similar post, but... I went over to huggingface and took a look at this.
Deepseek is huge. Like Llama 3.3 huge. I haven't done any benchmarking, which I'm guessing is out there, but it surely would take as much Nvidia muscle to run this at scale as ChatGPT, even if it was much, much cheaper to train, right?
So is the rout based on the idea that the need for training hardware is much smaller than suspected even if the operation cost is the same... or is the stock market just clueless and dumb and they're all running on vibes at all times anyway?
Two parts here.
As for the model.
This model is from China and trained there. China is under an embargo on the best chips; they can't get them. So they aren't supposed to have the resources to produce what we're seeing with DeepSeek, and yet, here we are. So either someone slipped them a shipment (a big no-no), OR we take it at face value that they've found a way to optimize training.
The neat thing about science is reproducibility. So given the paper DeepSeek wrote and the open source nature of this, someone should be able to sit down and reproduce it in about two months (ish). If they can, nVidia is going to have a completely terrible time and the US is going to have to rethink the whole AI embargo.
Without deep diving into this model and what it spouts, the skinny is this: nVidia's top tier AI GPUs have extra parts cut into the silicon that make creating a model cost a lot less in kilowatts of power. DeepSeek says they were able to put in software optimizations that get you a model on low kilowatts anyway, by replicating functions otherwise found only in those top tier AI GPUs.
Example of this: DeepSeek used 32 of the 132 streaming multiprocessors on their Hopper GPUs to act as a hardware-accelerated communication manager and scheduler. Top tier nVidia cards for big farms already do this in dedicated hardware, in a circuit called the DPU. Basically, DeepSeek found a way to make their Hopper GPUs do the same job as nVidia's DPUs.
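To make that concrete, here's a minimal sketch (my own illustration, not DeepSeek's actual code) of the general pattern: kick off communication asynchronously so it runs alongside compute instead of stalling it. Assumes PyTorch with a NCCL process group already initialized; the function name is made up.

```python
# Illustrative only: overlap gradient communication with compute,
# the same general idea as dedicating SMs to communication work.
import torch
import torch.distributed as dist

def overlapped_step(model, batch, prev_grads):
    # Start all-reducing the previous step's gradients asynchronously;
    # NCCL does this work concurrently on the GPU.
    handles = [dist.all_reduce(g, async_op=True) for g in prev_grads]

    # Forward/backward for the current microbatch runs while the
    # communication above is still in flight.
    loss = model(batch).sum()
    loss.backward()

    # Block only at the point where the synced gradients are needed.
    for h in handles:
        h.wait()
    return loss
```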
If true, it means the hardware nVidia puts into their top tier isn't strictly required. It's nice, and you'll still get a model on fewer kilowatts than with DeepSeek's tricks, but those tricks mean the price gap between top tier and low tier needs to be a lot smaller than it is for the top tier to stay competitive. As it stands (again, if DeepSeek's tricks prove correct), if you've got a little extra time, you can buy bottom tier AI GPUs and burn about the same kilowatts to get what the top tier kicks out with a hint fewer kilowatts. The electricity you save on the top tier isn't enough to justify its price premium over the low tier, if time is not a factor.
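Back-of-the-envelope version of that argument, with completely made-up numbers just to show the shape of the tradeoff:

```python
# Toy numbers only: not real GPU prices, power draws, or run times.
KWH_PRICE = 0.12  # $/kWh, assumed electricity rate

def total_cost(gpu_price, kw_draw, hours):
    # Total cost of a training run = hardware + electricity.
    return gpu_price + kw_draw * hours * KWH_PRICE

top = total_cost(gpu_price=30_000, kw_draw=0.7, hours=2_000)  # ~$30,168
low = total_cost(gpu_price=8_000, kw_draw=0.9, hours=2_600)   # ~$8,281

# The top tier finishes sooner and sips slightly fewer kilowatts, but
# if time isn't a factor, the power savings never cover the premium.
print(top, low)
```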
And so that brings us full circle here. If someone is able to reproduce DeepSeek's gains, nVidia's top tier GPUs are way overpriced and their bottom tier is going to sell like hotcakes. That's bad for nVidia if they were hoping to, IDK, make ridiculous profit. And that is why the sudden spook in the market. I mean, don't get me wrong, folks have been looking forward to popping nVidia's bubble, so they've absolutely been hyping this whole thing up a lot more. And it didn't help that the app hit #1 on Apple's App Store.
So some of this is people riding the hate-nVidia train. But some of it is also: well, this is interesting, if true. I think it's a little early to start victory laps around nVidia's grave. The optimizations proposed by DeepSeek have yet to be verified as accurate. And things are absolutely going to get interesting no matter the outcome. Because if the proposed optimizations don't actually produce the kind of model DeepSeek has, where did they get it from? How did they cheat? That's an interesting question in and of itself, because they aren't supposed to have hardware that would allow them to make this. Which could mean a few top tier cards are leaking into China's hands.
But if it all does prove true, well, he he he, nVidia shorts are going to be eating mighty well.
I believe it would have lower operational costs, assuming the model's the only thing that's different and they target the same size. DeepSeek does the "mixture of experts" approach, which makes it use a subset of its parameters, making it faster / less computationally expensive.
That said, I have a basic understanding of AI, so maybe my understanding is flawed.
But the models that are posted right now don't seem any smaller. The full precision model is positively humongous.
They found a way to train it faster. Fine. So they need fewer GPUs and can do it on slower ones that are much, much cheaper. I can see how Nvidia takes a hit on the training side.
But presumably the H100 is still faster than the H800s they used for this and presumably running the resulting model is still just as hard. All the improvements seem like they're on the training side.
Granted, I don't understand what they did and will have to go fishing for experts walking through it in more detail. I still haven't been able to run it myself, either; maybe it's still large but runs lighter on processing and that's noticeable. I just haven't seen any measurements of that side of things yet. All the coverage is about how cheap the training was on H800s.
They aren't necessarily smaller, from my understanding. Say it has 600B parameters; it just uses them more efficiently. You ask it a programming question, and it pulls the ~37B parameters most related to it and responds using those instead of processing all 600B.
Think of it like a model made of specialized submodels: it picks whichever submodel is likely to give the best answer for a given input and uses that.
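Roughly this, in toy form. This is an illustrative mixture-of-experts router I wrote for the explanation, nothing like DeepSeek's actual scale or architecture: a router scores the experts per token and only the top-k of them ever run.

```python
# Toy mixture-of-experts: only k of n_experts run per token, so the
# active parameter count is a fraction of the total parameter count.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # scores each expert
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.k = k

    def forward(self, x):  # x: (tokens, dim)
        # Pick the k best-scoring experts for each token.
        weights, idx = self.router(x).softmax(-1).topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for t in range(x.size(0)):  # route each token to its experts
            for w, e in zip(weights[t], idx[t]):
                out[t] += w * self.experts[int(e)](x[t])
        return out

moe = TinyMoE()
print(moe(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```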
Gotcha. Doesn't quite answer the running cost question on the face of it, though. Has anybody published any benchmarks with comparisons? All I see are qualitative benchmarks on the output and that mythical six million figure for the training, but I haven't found anything to justify the "Nvidia is doomed" narrative yet.
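For what it's worth, the measurement I'd want to see is dead simple, something like this (where `generate` is a hypothetical stand-in for whatever serving stack is being tested):

```python
# Sketch of a running-cost benchmark: tokens per second for the same
# prompt across models is what the inference question comes down to.
import time

def tokens_per_second(generate, prompt, n_tokens=256):
    start = time.perf_counter()
    generate(prompt, max_new_tokens=n_tokens)  # hypothetical callable
    return n_tokens / (time.perf_counter() - start)
```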
Ah yeah, I haven't seen anything on that. That'll be next week's headlines, probably lol
600B is comparable to things like Llama 3, but R1 is competing with OpenAI's o1 as a chain-of-thought model. How big that is isn't public, but it's thought that GPT-4 was already in the trillions and that o1 was a big step beyond that.
Hell, maybe that's the one real outright advantage of this weird panic. Maybe OpenAI is forced to share technical specs of their models again instead of working on a "trust me bro, it's too dangerous for you to know" basis.
Absolutely, the big American tech firms have gotten fat and lazy from their monopolies, actual competition will come as a shock to them.
But that's the thing, there was actual competition. It's not like they weren't competing with each other.
They are freaking out because the competition is Chinese, specifically. I seriously doubt the read of this situation would be that the bottom fell out of AI if one of the usual broligarchs had come up with a cheaper process for training.
Did the US accidentally generate an incentive for that to happen in China by shoddily blocking tensor math accelerators but only the really fancy ones and only kinda sorta sometimes? Sure. But both the fearmongering being used to enforce those limitations and the absolute freakout they are currently having seems entirely disconnected from reality.
Maybe we can go back to treating this as computer science rather than an arms race for a while now.
I don't doubt that part of it is that they are Chinese, but I think a big part of it is that they are willing to undercut the current players by 10x on price. That has scared the crap out of the "broligarchy" (great term), who are used to everything being cozy and not competing with each other on price, only using price as a method to drive non-tech companies out of markets.
They see what DeepSeek is doing as the equivalent of what Amazon did in online sales or Uber in taxis: aggressive underpricing in order to drive the competition out of the market.
Yeah, I don't know. I wonder.
The monetization scheme on all these AI applications was always confusing to me. It was certainly not seeking to break even, if their ravenous requests for funding are to be believed. Nobody is really looking at these based on token pricing, beyond other entrepreneurs hoping to make derivative content they can monetize themselves downstream.
They all seem to want to replicate the Google approach of giving stuff for free forever until you have a monopoly and my impression was that it really wasn't working. I still have that impression today, regardless of the cost per token on each platform.
So who knows. I'm also entirely capable of believing that all these idiots sincerely thought they were building the singularity and were complete geniuses and nobody but them could figure it out. I am making no assumptions at this point.
It comes in different versions, some of which are enormous and some that are pretty modest. The thing is, they are not competing with 4o but with o1, which has really massive resource requirements. The big news this week was supposed to be o3 opening up to the public, which involved another huge resource jump and set a clear trajectory for the future of AI being on an exponential curve of computing power. That was great news for companies that make the parts and can afford the massive buildout to support future developments. DeepSeek isn't so much disruptive for its own capabilities; it's disruptive because it challenges the necessity of this massive buildout.
I suppose the real change is to the assumption that we were going to have to go all paperclip-maximizer on Nvidia GPUs forever, and on that front, yeah, Nvidia would have become marginally less the owner of a planet made of GPUs all the way through.
They're still the owner of a planet made of rock where people run AIs on GPUs, though. Which I guess is worth like 15% less or whatever.
Clueless dumb vibes, yeah. But exaggerated by the media for clicks, too - Nvidia price is currently the same as it was in Sept 2024. Not really a huge deal.
Anyway, the more efficient it is the more usage there will be and in the longer run more GPUs will be needed - https://www.greenchoices.org/news/blog-posts/the-jevons-paradox-when-efficiency-leads-to-increased-consumption
Sure, 15% isn't the worst adjustment we've seen in a tech company by a long shot, even if the absolute magnitude of that loss is absolutely ridiculous because Nvidia is worth all the money, apparently.
But everybody is acting like this is a seismic shock, which is fascinatingly bizarre to me. It seems the normie-investor axis really believed that forcing Nvidia to sell China marginally slower hardware was going to cripple their ability to make chatbots permanently, which I feel everybody had called out as being extremely not the case even before these guys came up with a workaround for some of the technical limitations.
I think it has to do with how much cheaper the Chinese company is offering tokens for. It is severely undercutting the American companies. Going forward they won't have the unlimited cash they are used to.
But the cost per token would target Microsoft, Meta and Google way more than Nvidia. Nvidia still controls the infrastructure; the software guys are the ones being undercut.
Not that I expect the token revenue was generating "unlimited money" anyway, but still.