this post was submitted on 16 Dec 2023

Free Open-Source Artificial Intelligence

I am absolutely new to AI/ML and need some guidance/direction.

Every "New to AI, try this" guide I find either goes down a path that isn't right for the project I'm working on, or is so convoluted with terms I need to look up that I get rather frustrated. Maybe I'm too old to learn/use AI? Anyway . . .

This is my project, and any guidance, pointers, or help would be super appreciated. I'm working on a job aggregator. I have a simple web crawler that goes to a URL, fetches the HTML, cleans up a lot of the text and structure, and outputs the content of the job posting.
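For readers following along, here is a minimal sketch of the "clean the HTML" step using only Python's standard library. It is not the poster's actual crawler, which isn't shown in the thread. The names `TextExtractor` and `html_to_text` and the choice of tags to skip are assumptions made for illustration, and fetching the page is left out. One design choice worth noting: it keeps a line break at block boundaries, so headings and paragraphs don't run together.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips script/style content (hypothetical helper)."""

    SKIP = {"script", "style", "noscript"}
    BLOCK = {"p", "div", "li", "br", "h1", "h2", "h3", "h4", "ul", "ol", "section"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")  # mark a block boundary

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return one line of text per block element, with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


sample = (
    "<html><head><style>p{}</style></head><body>"
    "<h1>Job Title</h1><p>This is what we do</p>"
    "<p>We are looking for   someone who</p><script>x()</script>"
    "</body></html>"
)
cleaned = html_to_text(sample)
print(cleaned)
```

Keeping one paragraph per line makes the later "which part is the job description" step easier, because each line can be judged on its own.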

I then go in manually, look at that simplified HTML and extract the actual job description (vs Company description, benefits, other stuff on a job posting) to be used in another database. I use the exact wording, straight copy and paste, no summarization or interpretation.

I have about 400 data points in a database that I've manually extracted. Each one looks like this:

job_site: "COMPANY_NAME"

raw_html: "Job TitleThis is what we doWe are looking for someone who"

job_description: "We are looking for someone who"

I feel like I can use that as training data to do some form of text . . . extraction ?? . . . from an HTML document. But I don't have any clue where to start.
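Because the job description is copied verbatim from the cleaned posting, each record can be turned into a (start, end) character span. That is the usual shape of training data for extractive approaches, such as extractive question answering or span tagging. The sketch below assumes the field names from the example record. The helpers `find_span` and `build_span_dataset` are hypothetical names, and the output dictionary layout is just one possible convention.

```python
from typing import Optional, Tuple


def find_span(raw_text: str, description: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of description inside raw_text, or None."""
    start = raw_text.find(description)
    if start == -1:
        return None
    return start, start + len(description)


def build_span_dataset(records):
    """Split records into span-labelled examples and ones needing manual review."""
    examples, unmatched = [], []
    for rec in records:
        span = find_span(rec["raw_html"], rec["job_description"])
        if span is None:
            # The copy-paste did not match exactly (e.g. whitespace differences).
            unmatched.append(rec)
        else:
            examples.append(
                {"text": rec["raw_html"], "start": span[0], "end": span[1]}
            )
    return examples, unmatched


# The example record from the post.
record = {
    "job_site": "COMPANY_NAME",
    "raw_html": "Job TitleThis is what we doWe are looking for someone who",
    "job_description": "We are looking for someone who",
}

examples, unmatched = build_span_dataset([record])
example = examples[0]
print(example["text"][example["start"]:example["end"]])
```

Records that land in `unmatched` are worth checking by hand before training, since a description that is not an exact substring cannot be expressed as a span.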

DrakeRichards 4 points 11 months ago

Are you using this as a project to learn about machine learning, or are you trying to use machine learning to solve this project? I truthfully don’t know much about the inner workings of ML, but this project seems like something that’s already very doable without ML.

Loopedcandle 3 points 11 months ago

Honestly, a bit of both. Frustratingly enough, because every hiring manager writes them differently, it takes a human (or hopefully ML) to determine what part of the post is the job description. There is a lot of "Why you should work with us" or "We have ping pong tables" or "Great benefits" - none of which actually describe what the position is. All of these jobs usually have a paragraph (or sometimes a bulleted list) with a job description, but it is pretty different every time.
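One way to picture the "which paragraph is the job description" problem is as paragraph classification: label each paragraph of a posting 1 if it appears in the hand-extracted description and 0 otherwise, then train a simple classifier on those labels. The sketch below uses a tiny word-count Naive Bayes written from scratch. It is a baseline picked for illustration, not a method anyone in the thread proposed. The names `label_paragraphs` and `NaiveBayes` are hypothetical, and the training sentences are toy examples (some quoted from the thread, one invented), not real data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_paragraphs(paragraphs, description):
    """Label a paragraph 1 if it appears verbatim in the extracted description."""
    return [(p, 1 if p.strip() and p.strip() in description else 0) for p in paragraphs]


class NaiveBayes:
    """Two-class multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, labelled):
        for text, label in labelled:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def score(self, text, label):
        """Log of (smoothed class prior times smoothed word likelihoods)."""
        total_docs = sum(self.doc_counts.values())
        log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        denom = total_words + len(self.vocab) + 1
        for word in tokenize(text):
            log_p += math.log((self.word_counts[label][word] + 1) / denom)
        return log_p

    def predict(self, text):
        return 1 if self.score(text, 1) > self.score(text, 0) else 0


# Toy training data; the last sentence is invented for illustration.
training = [
    ("Why you should work with us", 0),
    ("We have ping pong tables", 0),
    ("Great benefits", 0),
    ("We are looking for someone who", 1),
    ("You will build and maintain web crawlers", 1),
]
model = NaiveBayes().fit(training)

labels = label_paragraphs(
    ["Job Title", "This is what we do", "We are looking for someone who"],
    "We are looking for someone who",
)
print(labels)
print(model.predict("Great benefits and ping pong tables"))
```

With around 400 labelled postings, a baseline like this mainly helps show how far simple word statistics get before trying anything heavier. Every hiring manager writes differently, so it would need checking against held-out postings.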