this post was submitted on 16 Dec 2023
Free Open-Source Artificial Intelligence
Thanks for this! I'll start learning!
A friend mentioned I should start with a pre-trained model, because 400 (and growing by 50ish a week with my crawler) is just not nearly enough, and then do continued learning on that pre-trained model. Does that sound right?
Yeah, model training is hard. Like capital H HARD. You need a bunch of data, and it needs to be high quality.
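For example: New York is the financial center of the USA, so a model can end up learning where a posting was written instead of what the job is. Separating actual finance jobs from job postings that were just written by someone using New England vernacular is a step you need to go through to make sure your data is high enough quality.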
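To make that kind of check concrete, here is a rough sketch (not a real data-quality pipeline) that flags words showing up almost exclusively under one label. The function name `suspicious_tokens`, the thresholds, and the toy postings are all made up for illustration; it only uses the Python standard library.

```python
from collections import Counter, defaultdict


def suspicious_tokens(postings, min_count=3, threshold=0.9):
    """Return tokens that appear almost exclusively under a single label.

    postings: iterable of (text, label) pairs.
    min_count and threshold are arbitrary illustrative defaults.
    """
    token_label_counts = defaultdict(Counter)
    for text, label in postings:
        # Count each token once per posting so long postings don't dominate.
        for token in set(text.lower().split()):
            token_label_counts[token][label] += 1

    flagged = {}
    for token, counts in token_label_counts.items():
        total = sum(counts.values())
        if total < min_count:
            continue  # too rare to say anything about
        label, top = counts.most_common(1)[0]
        if top / total >= threshold:
            flagged[token] = (label, top / total)
    return flagged


# Hypothetical toy data, just to show the shape of the input.
postings = [
    ("analyst wanted in manhattan", "finance"),
    ("manhattan hedge fund seeks analyst", "finance"),
    ("trader role manhattan office", "finance"),
    ("nurse needed in boston", "healthcare"),
    ("boston clinic hiring nurse", "healthcare"),
    ("nurse position in chicago", "healthcare"),
]

flagged = suspicious_tokens(postings)
for token, (label, share) in sorted(flagged.items()):
    print(f"{token!r} appears under {label!r} in {share:.0%} of its postings")
```

On this toy data both "nurse" and "manhattan" get flagged. The first is a legitimate signal, the second is a location leaking into the label, and the script can't tell the difference. It only gives you a shortlist to review by hand.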
So if you are just starting, use the 20 Newsgroups dataset from those links. It's pretty good data with a ton of resources written about it. It's not fun data, but it isn't as likely to fall victim to biases you aren't expecting.
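If you want a starting point for poking at that dataset, a minimal sketch could look like the following. It assumes you have scikit-learn installed and uses its `fetch_20newsgroups` loader, which downloads the data on first use. The helper names `label_counts`, `load_newsgroups`, and `main` are mine, not from the linked resources.

```python
from collections import Counter


def label_counts(targets, target_names):
    """Count how many documents fall under each newsgroup name."""
    counts = Counter(target_names[t] for t in targets)
    return counts.most_common()


def load_newsgroups():
    """Fetch the training split (requires scikit-learn and a network connection)."""
    # Imported here so the rest of the file works without scikit-learn.
    from sklearn.datasets import fetch_20newsgroups

    # Stripping headers, footers, and quotes removes metadata that makes
    # the classification task artificially easy.
    return fetch_20newsgroups(
        subset="train", remove=("headers", "footers", "quotes")
    )


def main():
    data = load_newsgroups()
    print(f"{len(data.data)} documents")
    for name, n in label_counts(data.target, data.target_names):
        print(f"{n:5d}  {name}")
```

Calling `main()` prints the number of documents and a per-newsgroup count, which is a quick way to see how balanced the classes are before training anything.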