this post was submitted on 03 Jun 2024
1475 points (98.0% liked)

People Twitter

People tweeting stuff. We allow tweets from anyone.

RULES:

  1. Mark NSFW content.
  2. No doxxing people.
  3. Must be a tweet or similar.
  4. No bullying or international politics.
  5. Be excellent to each other.

founded 1 year ago
[email protected] 4 points 5 months ago (1 children)

That's precisely what I was thinking, but on reflection I don't know how well it would handle the webpages, so maybe some other languages would need to be mixed in too (I'm out of date, maybe PHP?). If AI-written code worked reliably it would lower the barrier, but I'm not certain we're at the point yet where we can trust whatever it creates.

[email protected] 3 points 5 months ago* (last edited 5 months ago)

Python web scraping is just fine. With LLMs you have the option of either extracting the HTML and having the LLM read over it, or having a vision AI OCR the page and decide for itself what to extract.
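
For anyone curious what the first option (extract the HTML, let the LLM read it) might look like, here's a rough sketch using only Python's standard library. The names `TextExtractor`, `html_to_text`, and `build_prompt` are made up for illustration, and the prompt layout is just an assumption. It stops short of actually calling a model, since that part depends on whichever LLM you use.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce an HTML document to newline-separated visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def build_prompt(html, question):
    # Hypothetical prompt layout; adjust to whatever your model expects.
    return f"{question}\n\nPage text:\n{html_to_text(html)}"
```

Stripping the markup first keeps the prompt much smaller than pasting raw HTML, though you lose structure like tables and links, so for messy pages the vision/OCR route may still be the better fit.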