this post was submitted on 18 Nov 2024


Stubsack: weekly thread for sneers not worth an entire post, week ending 24th November 2024 (awful.systems)

submitted 1 month ago by [email protected] to c/[email protected]


Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

Last week's thread

(Semi-obligatory thanks to @dgerard for starting this)


[–] [email protected] 10 points 3 weeks ago

Stack Overflow, now with sponsored crypto blogspam: "Joining forces: How Web2 and Web3 developers can build together"

I really love the byline here. "Kindest view of one another". Seething rage at the bullshittery these "web3" fuckheads keep producing certainly isn't kind for sure.

[–] [email protected] 10 points 3 weeks ago (6 children)

Dude discovers that one LLM is not entirely shit at chess, spends time and tokens proving that other models are actually also not shit at chess.

The irony? He's comparing it against Stockfish, a computer chess engine. Computers playing chess at a superhuman level is a solved problem. LLMs have now slightly approached that level.

"For one, gpt-3.5-turbo-instruct rarely suggests illegal moves,"

Writeup https://dynomight.net/more-chess/

HN discussion https://news.ycombinator.com/item?id=42206817

[–] [email protected] 10 points 3 weeks ago

Particularly hilarious how thoroughly they're missing the point. The fact that it suggests illegal moves at all means that no matter how good its openings are, the scaling laws and emergent behaviors haven't magicked up an internal model of the game of chess or even the state of the chess board it's working with. I feel like playing games is a particularly powerful example of this because the game rules provide a very clear structure to model and it's very obvious when that model doesn't exist.

[–] [email protected] 9 points 3 weeks ago

@gerikson @BlueMonday1984 the only analysis of computer chess anybody needs https://youtu.be/DpXy041BIlA?si=a1vU3zmOWs8UqlSQ

[–] [email protected] 9 points 3 weeks ago* (last edited 3 weeks ago)

I remember several months ago (a year ago?) when the news got out that gpt-3.5-turbo-papillion-grumpalumpgus could play chess at around 1600 Elo. I suspected the apparent skill was just a hacked-on patch to stop folks from clowning on their models on xitter. Like, if an LLM had just read the instructions of chess and started playing like a competent player, that would be genuinely impressive. But if what happened is they generated 10^12 synthetic games of chess played by stonk fish and used that to train the model, that ain't an emergent ability, that's just brute forcing chess. The fact that larger, open-source models that perform better on other benchmarks still flail at chess is just a glaring red flag that something funky was going on w/ gpt-3.5-turbo-instruct to drive home the "eMeRgEnCe" narrative. I'd bet decent odds that if you played with modified rules (knights move in a one-space-longer L shape, you cannot move a pawn 2 moves after it last moved, etc.), gpt-3.5 would fuckin suck.

Edit: the author asks "why skill go down tho" on later models. Like, isn't it obvious? At that point in time, chess skills weren't a priority, so the trillions of synthetic games weren't included in the training? Like this isn't that big of a mystery...? It's not like other NNs haven't been trained to play chess...


[–] [email protected] 10 points 3 weeks ago (2 children)

Never thought I'd die fighting alongside a League of Legends fan.

How about an artist valuer?

Aye. That I could do.


[–] [email protected] 9 points 3 weeks ago

caption: """AI is itself significantly accelerating AI progress"""

wow I wonder how you came to that conclusion when the answers are written like a Fallout 4 dialogue tree

"YES!!!"
"Yes!!"
"Yes."
" (yes)"

[–] [email protected] 9 points 3 weeks ago (1 children)

Strap in and start blasting the Depeche Mode.


[–] [email protected] 9 points 3 weeks ago (1 children)

So we have this new tech that makes stuff up and is also a bit racist at times? Let's use it to monitor employees. Of course, it also trains to replace your job.

[–] [email protected] 9 points 3 weeks ago (1 children)

This is fucked even without the hallucinating Clippy in the backend.

"If your desktop is idle for more than 30-60 seconds (no "meaningful" mouse & keyboard movement), you get a red flag"

People getting ~~flogged~~ flagged for being lazy for a few seconds reminds me of something …
