What’s Really Going On in Machine Learning? Some Minimal Models | Stephen Wolfram | August 22, 2024
(writings.stephenwolfram.com)
I don't think that this critique is focused enough to be actionable. It doesn't take much effort to explain why a neural network made a decision, but the effort scales with the size of the network, and LLMs are quite large, so the amount of effort is high. See recent posts by (in increasing disreputability of sponsoring institution) folks at MIT and University of Cambridge, Cynch.ai, Apart Research, and University of Cambridge, and LessWrong. (Yep, even the LW cultists have figured out neural-net haruspicy!)
I was hoping that your complaint would be more like Evan Miller's Transformers note, which lays out a clear issue in the Transformers arithmetic and gives a possible solution. If this seems like it's over your head right now, then I'd encourage you to take it slowly and carefully study the maths.
Fair enough. I'm still at work so I've only skimmed these so far. I appreciate the feedback and links and I'll definitely look into it more.
I completely agree that my critique isn't focused enough. I slapped that comment together entirely too fast without much deeper thought involved. I have a very surface-level understanding of this kinda stuff. Regardless, I do like sharing my opinion from an outsider's perspective. Mostly because I enjoy the discussion. It's always an opportunity to learn something new, even if it ruffles a few feathers along the way. I know that whenever I'm super invested in a topic, no matter what it is, I sometimes get so soaked up in it all that I tend to ignore outside influences.