TechTakes

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

DeepSeek roundup: banned by governments, no guard rails, lied about its training costs (pivot-to-ai.com)

submitted 4 days ago by [email protected] to c/[email protected]

[–] [email protected] 11 points 3 days ago* (last edited 3 days ago) (2 children)

consider this paragraph from the Wall Street Journal:

DeepSeek said training one of its latest models cost $5.6 million, compared with the $100 million to $1 billion range cited last year by Dario Amodei, chief executive of the AI developer Anthropic, as the cost of building a model.

you're arguing to me that they technically didn't lie -- but it's pretty clear that some people walked away with a false impression of the cost of their product relative to their competitors' products, and they financially benefitted from people believing in this false impression.

[–] [email protected] 2 points 2 days ago (1 children)

Okay I mean, I hate to somehow come to the defense of a slop company? But WSJ saying nonsense is really not their fault, like even that particular quote clearly says "DeepSeek said training one" cost $5.6M. That's just a true statement. No one in their right mind includes the capital expenditure in that, the same way when you say "it took us 100h to train a model" that doesn't include building a data center in those 100h.

Besides whether they actually lied or not, it's still immensely funny to me that they could've just told a blatant lie nobody factchecked and it shook the market to the fucking core wiping off like billions in valuation. Very real market based on very real fundamentals run by very serious adults.

[–] [email protected] 3 points 2 days ago* (last edited 2 days ago) (1 children)

i can admit it's possible i'm being overly cynical here and it is just sloppy journalism on Raffaele Huang/his editor/the WSJ's part. but i still think that it's a little suspect on the grounds that we have no idea how many times they had to restart training due to the model borking, other experiments and hidden costs, even before things like the necessary capex (which goes unmentioned in the original paper -- though they note using a 2048-GPU cluster of H800's that would put them down around $40m). i'm thinking in the mode of "the whitepaper exists to serve the company's bottom line"

btw announcing my new V7 model that i trained for the $0.26 i found on the street just to watch the stock markets burn

[–] [email protected] 4 points 2 days ago

but i still think that it’s a little suspect on the grounds that we have no idea how many times they had to restart training due to the model borking, other experiments and hidden cost

Oh ye, I totally agree on this one. This entire genAI enterprise insults me on a fundamental level as a CS researcher, there's zero transparency or reproducibility, no one reviews these claims, it's a complete shitshow from terrible, terrible benchmarks, through shoddy methodology, up to untestable and bonkers claims.

I have zero good faith for the press, though, they're experts in painting any and all tech claims in the best light possible like their lives fucking depend on it. We wouldn't be where we are right now if anyone at any "reputable" newspaper like WSJ asked one (1) question to Sam Altman like 3 years ago.

[–] [email protected] -2 points 3 days ago (1 children)

but it's pretty clear that some people walked away with a false impression of the cost of their product relative to their competitors' products

Ask yourself why that may be, as you are the one who posted a link to a WSJ article that is repeating an absurd 100m-1b figure from a guy who has a vested interest in making the barrier of entry into the field seem as high as possible to increase the valuation of his company. Did WSJ make an attempt to verify the accuracy of these statements? Did it push for further clarification? Did it compare those statements to figures that have been made public by Meta and OpenAI? No on all counts - yet somehow "deepseek lied" because it explicitly stated their costs didn't include capex, salaries, or R&D, but the media couldn't be bothered to read to the end of the paragraph

[–] [email protected] 6 points 3 days ago (1 children)

"the media sucks at factchecking DeepSeek's claims" is... an interesting attempt at refuting the idea that DeepSeek's claims aren't entirely factual. beyond that, intentionally presenting true statements that lead to false impressions is a kind of dishonesty regardless. if you mean to argue that DeepSeek wasn't being underhanded at all and just very innocently presented their figures without proper context (that just so happened to spur a media frenzy in their favor)... then i have a bridge to sell you.

besides that, OpenAI is very demonstrably pissing away at least that much money every time they add one to the number at the end of their slop generator

[+] [email protected] -6 points 3 days ago* (last edited 3 days ago) (2 children)

"the media sucks at factchecking DeepSeek's claims" is... an interesting attempt at refuting the idea that DeepSeek's claims aren't entirely factual.

That's the opposite of what I'm saying. Deepseek is the one under scrutiny, yet they are the only one to publish source code and training procedures of their model. So far the only argument against them is "if I read the first half of a sentence in Deepseek's whitepaper and pretend the other half of the sentence doesn't exist, I can generate a newsworthy headline". So much so that you just attempted to present a completely absurd and unverifiable number from a guy with a financial incentive to exaggerate, and a non apples-to-apples comparison made by WSJ as airtight evidence against them. OpenAI allegedly has enough hardware to invalidate Deepseek's training claims in roughly five hours - given the massive financial incentive to do so, if Deepseek was being untrustworthy, you don't think they would have done so by now?

if you mean to argue that DeepSeek wasn't being underhanded at all and just very innocently presented their figures without proper context (that just so happened to spur a media frenzy in their favor)... then i have a bridge to sell you.

What do you mean proper context? I posted their full quote above, they presented their costs with full and complete context, such that the number couldn't be misconstrued without one being willfully ignorant.

OpenAI is very demonstrably pissing away at least that much money every time they add one to the number at the end of their slop generator

It sounds to me like you have a very clear bias, and you don't care at all about whether or not what they said is actually true or not, as long as the headlines about AI are negative

[–] [email protected] 10 points 3 days ago

this is utterly pointless and you’ve taken up way too much space in the thread already

It sounds to me like you have a very clear bias, and you don’t care at all about whether or not what they said is actually true or not, as long as the headlines about AI are negative

oh no, anti-AI bias in TechTakes? unthinkable

[–] [email protected] 6 points 3 days ago* (last edited 3 days ago)

That's the opposite of what I'm saying. Deepseek is the one under scrutiny, yet they are the only one to publish source code and training procedures of their model.

this has absolutely fuck all to do with anything i've said in the slightest, but i guess you gotta toss in the talking points somewhere

e: it's also trivially disprovable, but i don't care if it's actually true, i only care about headlines negative about AI