this post was submitted on 03 Jun 2024
1475 points (98.0% liked)

People Twitter

5291 readers
2272 users here now

People tweeting stuff. We allow tweets from anyone.

RULES:

  1. Mark NSFW content.
  2. No doxxing people.
  3. Must be a tweet or similar
  4. No bullying or international politcs
  5. Be excellent to each other.

founded 1 year ago
MODERATORS
 
you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 3 points 5 months ago* (last edited 5 months ago)

Python web scraping is just fine, with the llms you.have the option of either extracting the html and having the LLM read.over that, or having a vision ai OCR the page and make its own decision of what to extract.