I don't understand what exactly is being verified there. Model integrity? Factors for "reasoning"?
Integrity of the model, inputs, and outputs, but with the potential to hide either the inputs or the model and maintain verifiability.
Definitely not reasoning, that's a whole can of worms.
But what is meant by "integrity of the model, inputs and outputs"?
I guess I don't understand the attack vector; what's the threat here? Someone messes with the model file, or fine-tunes a model toward a specific malicious bias (like inserting scam links where legit links would go) and passes it off as the real deal?
I'm more general cybersec than crypto so idk but isn't that what hash sums are for?
Surely if someone messed with my .ckpt or .safetensors it won't be the same file anymore?
And what does that have to do with validity of the inputs?
Isn't this still subject to the same problem, where a system can lie about its inference chain by returning a plausible chain which wasn't the actual chain used for the conclusion? (I'm thinking from the perspective of a consumer sending an API request, not the service provider directly accessing the model.)
Also:
Any time I see a highly technical post talking about AI and/or crypto, I imagine a skilled accountant living in the middle of mob territory. They may not be directly involved in any scams themselves, but they gotta know that their neighbors are crooked and a lot of their customers are gonna use their services in nefarious ways.
The model that is doing the inference is committed to beforehand (it's hashed), so you can't lie about which model produced the inference. That is how ezkl, the underlying library, works.
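To make the "committed to beforehand" part concrete, here's a minimal sketch of the commitment idea using nothing but a hash. The file name is hypothetical, ezkl's actual commitment scheme is more involved than a raw file hash, and the ZK proof is what ties a specific inference to the commitment; this only shows the commit/check step:

```python
import hashlib
from pathlib import Path

def commit_to_model(path: str) -> str:
    # Publishing this digest up front is the "commitment" to the model.
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

# Model owner posts this somewhere public before anyone runs inference.
published = commit_to_model("model.safetensors")  # hypothetical file

# Later, anyone handed a model claimed to be the committed one can re-check.
assert commit_to_model("model.safetensors") == published
```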
I know a lot of people in this cryptography space, and there are definitely scammers across the general "crypto space", but in the actual cryptography space most people are driven by curiosity or ideology.
I appreciate the reply! And I'm sure I'm missing something, but... Why can't you just lie about the model you used?
Ahh, ya, so this is a deep rabbit hole, but I will try to explain it as best I can.
Zero-knowledge proofs are a cryptographic way of proving that some computation was done correctly. They also allow you to "hide" some of the inputs if you want.
In the context of the ezkl library, this lets someone train a model and publicly commit to it by posting a hash of the model somewhere. Someone else can then run inference on that model, and what comes out is the hash of the model, the output of the inference, and a cryptographic "proof". Anyone can verify from that proof that the computation was indeed done with that model and that the result is correct, yet the person running the inference can keep the input hidden.
Or let's say you have a competition for whoever can train the best classifier for some specific task. I could train a model and run it on the public test-set inputs while "hiding" the model itself; the zk computation would still reveal the hash of the model. So if I won this competition, I could reveal my model at the end, and anyone would be able to check that the model I revealed and the model that was run to beat everyone else were in fact the same model.
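If it helps, the commit-then-reveal bookkeeping in that competition example looks roughly like this. It's a stdlib-only sketch with made-up file names and values; the actual ZK proof that the committed model really produced the winning predictions is the part ezkl-style tooling provides, which I'm not reproducing here:

```python
import hashlib

def commitment(model_bytes: bytes, salt: bytes) -> str:
    # Salted hash, so the published commitment doesn't leak anything about the model.
    return hashlib.sha256(salt + model_bytes).hexdigest()

# During the competition: each entrant publishes only a commitment and a score.
my_model = open("my_classifier.onnx", "rb").read()  # hypothetical file
my_salt = b"random-secret-salt"                     # kept private until reveal
leaderboard = [{"who": "me", "score": 0.97, "commit": commitment(my_model, my_salt)}]

# After the competition: the winner reveals the model and the salt,
# and anyone can check it matches what was committed all along.
assert commitment(my_model, my_salt) == leaderboard[0]["commit"]
```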
Hey can someone dumb down the dumbed down explanation for me please?
AI is a magical black box that performs a bunch of actions to produce an output. We can’t trust what a developer says the black box does inside without it being completely open source (including weights).
This is a concept for a system where the actions performed can be proved to people who don't have visibility inside the box, so they can trust that the box is doing what it says it's doing.
An AI enemy that can prove it isn’t cheating by providing proof of the actions it took. In theory.
Zero-knowledge proofs make a lot of sense for cryptography, but in a more abstracted setting like this, the approach still relies on a lot of trust that the implementation generates proofs for all actions.
Whenever I see Web3, I personally lose any faith in whatever is being presented or proposed. To me, blockchain is an impressive solution to no real problem (except perhaps border control / customs).
Zk in this context allows someone to be able to thoroughly test a model and publish the results with proof that the same model was used.
Blockchain for zk-ml is actually a great use case for 2 reasons:
- It's a public, immutable database where people can commit to the hash of some model they want to hide.
- It allows someone with a "model" (it doesn't have to be a neural net; it could be some statistical computation) and a verifier to do work for others for a fee. Let's say I have a huge data set of property values/data for some given area, I'm a real estate agent, and I want other people to run some crazy computation on it to predict which houses will likely sell first in the next 30 days. I could post this challenge online with the data, and other people could run models against that data and post their results (but not how they got them) on chain. In 30 days I could publish the updated data, reward the best performer, and potentially "buy" their model. You could do this with a centralized service, but it would likely take a fee, keep things proprietary, and probably try to make some shady back-room deals. This removes the middleman.
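To sketch that second use case end to end: everything "on chain" below is just a Python list, the names and numbers are invented, and the ZK piece that proves the predictions really came from the committed model is omitted.

```python
import hashlib, json

def commit(obj, salt: str) -> str:
    # Salted hash of a JSON-serialized object, used as a simple commitment.
    return hashlib.sha256((salt + json.dumps(obj, sort_keys=True)).encode()).hexdigest()

chain = []  # stand-in for the public, immutable database

# 1. A competitor posts a commitment to their predictions (not the predictions themselves).
predictions = {"12 Oak St": 0.91, "7 Elm Ave": 0.42}  # model output, kept private for now
salt = "keep-this-secret-until-reveal"
chain.append({"who": "competitor_1", "commit": commit(predictions, salt)})

# 2. Thirty days later the agent publishes what actually sold.
actual_sales = {"12 Oak St": True, "7 Elm Ave": False}

# 3. Competitors reveal predictions + salt; anyone can check the commitment
#    and score them against the now-public outcomes.
assert commit(predictions, salt) == chain[0]["commit"]
score = sum((predictions[h] > 0.5) == sold for h, sold in actual_sales.items())
print(f"competitor_1 got {score}/{len(actual_sales)} right")
```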
The way AI is trained today creates a black-box solution; the author says only the developers of the model know what goes on inside the black box.
This is a major pain point in AI, where we are trying to understand it so we can make it better and more reliable. The author mentions that unless AI companies open source their work, it's impossible for everyone else to 'debug' the circuit.
Zero-knowledge proofs are how they are trying to combat this: using mathematical algorithms, they aim to verify the output of an AI model in real time without having to know the underlying intellectual property.
This could be used to train AI further and drastically increase its reliability, so it could be trusted with more important decisions and adhere much more closely to the strategies for which it is deployed.
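In practice, what a consumer of such a service would check looks something like this. The field names are made up for illustration, and the real proof verification is a library call I've stubbed out; this only shows the shape of the check:

```python
# What an inference API might return alongside its answer (illustrative fields only).
response = {
    "output": [0.02, 0.95, 0.03],
    "model_commitment": "ab34...",  # hash the provider committed to earlier
    "proof": "<zk-proof-bytes>",
}

PUBLISHED_COMMITMENT = "ab34..."    # fetched from wherever the provider published it

def verify_proof(proof: str, public_values: dict) -> bool:
    # Placeholder: a real zero-knowledge verifier from a proving library goes here.
    # It checks the proof against the public values without re-running the model.
    return True

def consumer_checks(resp: dict) -> bool:
    same_model = resp["model_commitment"] == PUBLISHED_COMMITMENT
    return same_model and verify_proof(resp["proof"],
                                        {"model": resp["model_commitment"],
                                         "output": resp["output"]})
```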
Thanks for the 'for dummies' explanation.