this post was submitted on 03 Jun 2024
1476 points (98.0% liked)
People Twitter
5447 readers
2865 users here now
People tweeting stuff. We allow tweets from anyone.
RULES:
- Mark NSFW content.
- No doxxing people.
- Must be a pic of the tweet or similar. No direct links to the tweet.
- No bullying or international politcs
- Be excellent to each other.
- Provide an archived link to the tweet (or similar) being shown if it's a major figure or a politician.
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
The reality is, though, that there are no such APIs. LLMs on the other hand could be a valid tool for the use case.
It's not that there's no API. It's that there's probably a different API for every single grocery store. And they make random changes and don't have public documentation. That's why we need the AI.
Yup, exactly, no standardized APIs.
The stores don't want you to have easy comparable access to their prices.
They'd quite like it if you just came in, saw that the item you wanted is out of stock, and then just buy some shit you didn't need.
Yeah, we're not going to make technology that drives prices down
But they'll happily give you full access to everything they have if you're another corpo and you promise to marginally improve their sales anyhow. That's, sadly, how businesses work.
Indeed. LLMs read with the same sort of comprehension that humans have, so if a supermarket makes their website compatible with humans then it's also compatible with LLMs. We have the same "API", as it were.
Can LLMs interpret structured input like html?
Yup. And those that can't can have a parser pull just the human-readable text out, like a blind person's screen-reader would do.
That sounds like an issue with your system prompt. If you're using an LLM to interpret web pages for price information then you'd want to include instructions about what to do if the information simply isn't in the web page to begin with. If you don't tell the AI what to do under those circumstances you can't expect any specific behaviour because it wouldn't know what it's supposed to do.
I suspect from this comment that you haven't actually worked with LLMs much, and are just going off the general "lol they hallucinate" perception they have right now? I've worked with LLMs a fair bit and they very rarely have trouble interpreting what's in their provided context (as would be the case here with web page content). Hallucinations come from relying on their own "trained" information, which they recall imperfectly and often gets a bit jumbled. To continue using a human analogy, it's like asking someone to rely on their own memory rather than reading information from a piece of paper.
Or you could just prompt it to not guess prices for articles that don't exist. Those models are pretty good at following instructions.
No, that's why we need regulations to enforce standards.
You just need someone to do it. Here in Austria someone did it: https://heisse-preise.io
It's only in German and most of the prices aren't from a public API but crawled from different sources.
It's open source. Nothing except greed is stopping them from providing something like this.
Imagine if instead of building their own bespoke systems, grocery stores (and other places) created an open source software foundation and worked together to produce the software they needed.
I sometimes dream of such things. Less waste, better inventory, customers get to choose inventory based on their wishlist, better prices, then I wake up.
We actually have a small liquor store nearby that really puts stuff on the shelves if you casually mention something you like. But that's more the exception than the rule.
That's impressive, and honestly looks like it was quite a bit of work. I wonder how the author finances himself? There doesn't even seem to be a donation button on the site. I found a lengthy article on Wired but it doesn't appear to mention how he can afford to do all of this for free.
Nothing is stopping anyone from doing this except the amount of work it takes to write and maintain all those data import scripts. I think greed is the wrong word here. It's not unreasonable to expect some sort of monetary reward for providing a useful public service that actually helps people save money. Everyone's gotta eat, right?
Actually, you'd be surprised. Instacart has up-to-date price and product data for TONS of grocery stores. And while their API likely isn't public, they MUST have one in order for their smartphone apps to work.
LLMs are not a good tool for processing data like this. They would be good for presenting that data though.
Make an LLM convert the data into a standardized format for your traditional algorithm.
There's no way to ensure that data will stay in that standardized format though. A custom model could but they are expensive to train.
Llms are excellent at consuming web data.
Not if you want to ensure the validity of the compiled coupons/discounts. A custom algorithm would be best but data standardization would be the main issue, regardless of how you process it.
What does validity mean in this case? A functionary LLM can follow links and make actions. I'm not saying it's not "work" to develop your personal bot framework, but this is all doable from the home PC, with a self hosted llm
Edit and of course you'll need non LLM code to handle parts of the processing, not discounting that
The LLM doesn't do that though, that the software built around it that does that which is what I'm saying. Its definitely possible to do, but the bulk of the work wouldn't be the task of the LLM.
Edit: forgot to address validity. By that I mean keeping a standard format and ensuring that the output is actually true given the input. Its not impossible, but its something that requires careful data duration and a really good system prompt.
Llms are great for scraping data
LLMs don't scrape data, scrapers scrape data. LLMs predict text.
https://youtu.be/fjP328HN-eY?si=quZeZx57fDjBW5EW
Puppeteer and gpt-vision are decidedly not LLMs
👍
Yes there are. You can obtain access to the Kroger API, the Meijer API, the Walmart API, and I'm sure others that I didn't bother to Google. Failing getting access to the actual APIs, there are tons of web scraper projects that just parse those stores' websites for product information, and web scrapers are still orders of magnitude more efficient than LLMs.
Instacart has prices for all of these stores and more. Obviously they're not updating them by hand...
At the cost of huge amounts of wasted energy and the whole litany of concerns that are always co-morbid with AI, but technically yes they could work for this lol. Ideally we'd have standardized APIs and mandated pricing transparency, but unfortunately we live in a capitalist society where that will literally never happen ever.