this post was submitted on 25 May 2024
AI
Artificial intelligence (AI) is intelligence demonstrated by machines, unlike the natural intelligence displayed by humans and animals, which involves consciousness and emotionality. The distinction between the former and the latter categories is often revealed by the acronym chosen.
founded 3 years ago
I would ignore the people who say you should deploy a model from someone else as that will teach you next to nothing about how this stuff works.
I would start with an older model and framework (e.g. scikit-learn) and go through all the processing, prediction, and evaluation steps using a model that's fairly simple to understand. Since you already know about linear regression, start with some of these linear models.
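To make the processing, prediction, and evaluation steps concrete, here is a minimal sketch in scikit-learn. The synthetic dataset from `make_regression`, the 80/20 split, and the choice of metrics are my assumptions for illustration, not something prescribed above; swap in your own data.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)

# Processing: hold out a test set so evaluation uses unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Processing + model: scale the features, then fit ordinary least squares.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

# Prediction on the held-out set.
y_pred = model.predict(X_test)

# Evaluation: mean squared error and R^2 on the test set.
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"test MSE: {mse:.2f}, test R^2: {r2:.3f}")

# The learned weights are small enough to inspect directly.
coefs = model.named_steps["linearregression"].coef_
print("coefficients:", coefs)
```

Being able to print and read the five coefficients is the point of starting here: every step is small enough to check by hand.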
Then, and only then, would I worry about neural networks and deep learning, since the main difference is a non-linear activation function and a much more complicated set of weights (model parameters in the linear regression language).
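A small NumPy sketch of why the activation function matters. The layer sizes, the random weights, and the choice of ReLU are assumptions for illustration. Without an activation, two stacked linear layers collapse into a single linear model; with one, they no longer do.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))  # 4 samples, 3 features

# Weights and biases for two layers (random, for illustration only).
W1, b1 = rng.normal(size=(3, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)


def relu(z):
    """Non-linear activation: zero out negative values."""
    return np.maximum(z, 0.0)


# Two stacked linear layers with no activation in between...
stacked_linear = (x @ W1 + b1) @ W2 + b2

# ...are exactly one linear model with merged weights.
W_merged = W1 @ W2
b_merged = b1 @ W2 + b2
single_linear = x @ W_merged + b_merged
print(np.allclose(stacked_linear, single_linear))

# Adding the activation between the layers gives a small neural network
# that can no longer be rewritten as a single linear map.
mlp = relu(x @ W1 + b1) @ W2 + b2
print(np.allclose(mlp, stacked_linear))
```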
Here is an example
Source: PhD in neural networks
You're right. I read past the "I want to learn ML" and went straight to "do something useful with the data".
If the goal is to understand how modern LLMs work, it's also good to read up on RNNs and LSTMs. For this, 3Blue1Brown does an amazing job, and even posted an in-depth video about transformers. I'd watch that next, followed by implementing a simple transformer in PyTorch (perhaps using the existing blocks).
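As a rough starting point for the "existing blocks" route, here is a sketch of a tiny decoder-style language model built from PyTorch's `nn.TransformerEncoderLayer`. The class name `TinyTransformerLM`, all the sizes, and the learned positional embedding are my own assumptions for illustration; this is not the exact architecture from any paper or video.

```python
import torch
from torch import nn


class TinyTransformerLM(nn.Module):
    """Hypothetical minimal language model assembled from built-in blocks."""

    def __init__(self, vocab_size=100, d_model=32, nhead=4, num_layers=2, max_len=64):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)  # token embeddings
        self.pos_emb = nn.Embedding(max_len, d_model)     # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nhead,
            dim_feedforward=4 * d_model,
            batch_first=True,  # inputs are (batch, sequence, features)
        )
        self.blocks = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, vocab_size)        # next-token logits

    def forward(self, tokens):
        seq_len = tokens.shape[1]
        positions = torch.arange(seq_len, device=tokens.device)
        h = self.tok_emb(tokens) + self.pos_emb(positions)
        # Causal mask: -inf above the diagonal blocks attention to later tokens.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1,
        )
        h = self.blocks(h, mask=causal)
        return self.head(h)


torch.manual_seed(0)
model = TinyTransformerLM()
tokens = torch.randint(0, 100, (2, 10))  # batch of 2 sequences, 10 tokens each
logits = model(tokens)
print(logits.shape)
```

Once this runs, a reasonable next exercise is to replace `nn.TransformerEncoderLayer` with your own attention and feed-forward code and check that the output shapes still match.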
You could argue that it's important to design everything from scratch first, but it's easier to first go high level, see how the network behaves, and then attempt to implement it yourself based on the paper. It is up to OP how comfortable he is with the topic though.