this post was submitted on 22 Jul 2023
131 points (85.8% liked)
Asklemmy
44265 readers
1573 users here now
A loosely moderated place to ask open-ended questions
Search asklemmy π
If your post meets the following criteria, it's welcome here!
- Open-ended question
- Not offensive: at this point, we do not have the bandwidth to moderate overtly political discussions. Assume best intent and be excellent to each other.
- Not regarding using or support for Lemmy: context, see the list of support communities and tools for finding communities below
- Not ad nauseam inducing: please make sure it is a question that would be new to most members
- An actual topic of discussion
Looking for support?
Looking for a community?
- Lemmyverse: community search
- sub.rehab: maps old subreddits to fediverse options, marks official as such
- [email protected]: a community for finding communities
~Icon~ ~by~ ~@Double_[email protected]~
founded 5 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
To the second question it's not novel at all. The models used were invented decades ago. What changed is Moores Law striked and we got stronger computational power especially graphics cards. It seems that there is some resource barrier that when surpassed turns these models from useless to useful.
Not the specific models unless I've been missing out on some key papers. The 90s models were a lot smaller. A "deep" NN used to be 3 or more layers and that's nothing today. Data is a huge component too
The specifics are a bit different, but the main ideas are much older than this, I'll leave here the Wikipedia
"Frank Rosenblatt, who published the Perceptron in 1958,[10] also introduced an MLP with 3 layers: an input layer, a hidden layer with randomized weights that did not learn, and an output layer.[11][12] Since only the output layer had learning connections, this was not yet deep learning. It was what later was called an extreme learning machine.[13][12]
The first deep learning MLP was published by Alexey Grigorevich Ivakhnenko and Valentin Lapa in 1965, as the Group Method of Data Handling.[14][15][12]
The first deep learning MLP trained by stochastic gradient descent[16] was published in 1967 by Shun'ichi Amari.[17][12] In computer experiments conducted by Amari's student Saito, a five layer MLP with two modifiable layers learned internal representations required to classify non-linearily separable pattern classes.[12]
In 1970, Seppo Linnainmaa published the general method for automatic differentiation of discrete connected networks of nested differentiable functions.[3][18] This became known as backpropagation or reverse mode of automatic differentiation. It is an efficient application of the chain rule derived by Gottfried Wilhelm Leibniz in 1673[2][19] to networks of differentiable nodes.[12] The terminology "back-propagating errors" was actually introduced in 1962 by Rosenblatt himself,[11] but he did not know how to implement this,[12] although Henry J. Kelley had a continuous precursor of backpropagation[4] already in 1960 in the context of control theory.[12] In 1982, Paul Werbos applied backpropagation to MLPs in the way that has become standard.[6][12] In 1985, David E. Rumelhart et al. published an experimental analysis of the technique.[7] Many improvements have been implemented in subsequent decades.[12]"
The idea of NN or the basis itself is not AI. If you had actual read D. E. Rumelhart, G. E. Hinton, and R. J. Williams, βLearning Internal Representations by Error Propagation.β Sep. 01, 1985. then you would understand this bc that paper is about a machine learning technique not AI. If you had done your research properly instead of just reading wikipedia, then you would have also come across autoassociative memory which is the precursor to autoencoders and generative autoencoders which is the foundation of a lot of what we now think of as AI models. H. Abdi, βA Generalized Approach For Connectionist Auto-Associative Memories: Interpretation, Implication Illustration For Face Processing,β in In J. Demongeot (Ed.) Artificial, University Press, 1988, pp. 151β164.
You don't get to complain about people being condescending to you when you are going around literally copy and pasting wikipedia. Also you're not right, major progress in this field started in the 80s although the concepts were published earlier, they were basically ignored by researchers. You're making it sound like the NNs we're using now are the same as the 60s when in reality our architectures and just even how we approach the problem have changed significantly. It's not until the 90s-00s that we started getting decent results that could even match older ML techniques like SVM or kNN.