Oh wait, you were the architect using the agent, so you own the responsibility? Isn't that already settled by now? Wasn't it your job to evaluate the agent itself before using it?
On the good side, these kinds of mistakes have been happening since the beginning, and that's how people learn, either directly or indirectly. Hopefully this at least helps the AI get better and helps people get better at using AI.
This is not something we support currently. We will need to do some research on ways to support it.
The main hurdle is that we can't rewrite secrets in any of the user's buffers, as that would violate our threat model, and signing is usually done in user space.
You are already doing a MITM, so someone is placing trust in you as an intermediary. In reality, the content distribution networks fronting any of these API operations have already muddied the water at this point. You are well within your rights to recalculate the signature for the payload and replace it using the secret key.
Yes, I agree, and we actually already do that for TLS when rewriting secrets after encryption. But my point is that in our threat model we consider the app an adversary, so we don't want to use any of its buffers to rewrite secrets: it would be trivial for an adversary to re-read the buffer after the rewrite and recover the secret. The way we overcome this is by listening to the user buffer and recording all the data we need to rewrite the secret, without writing anything. We then go back to the kernel buffer that is about to be sent to the network, which is not accessible to the user app, and perform the rewrite there.
For API keys used to sign the request we need to do something similar, which could be challenging within eBPF (maybe doable, I'm not sure).
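To make the signing problem concrete, here is a minimal sketch under a hypothetical HMAC-over-payload scheme (similar in spirit to AWS SigV4; none of this is AV's or the vendor's actual implementation). Because the signature covers the payload, a proxy that swaps a placeholder for the real key invalidates the original signature and must recompute it with the real secret:

```python
import hashlib
import hmac


def sign(secret: bytes, payload: bytes) -> str:
    """Hypothetical request signature: HMAC-SHA256 over the raw payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


# The app signs a payload containing only a placeholder...
placeholder_payload = b'{"key": "AV_PLACEHOLDER"}'
# ...and the rewriter later injects the real secret into the bytes on the wire.
real_payload = b'{"key": "sk-real-secret"}'
signing_secret = b"signing-secret"

# The original signature no longer matches the rewritten payload,
# so the rewriter must re-sign after the substitution.
assert sign(signing_secret, placeholder_payload) != sign(signing_secret, real_payload)
resigned = sign(signing_secret, real_payload)
```

Doing this re-signing inside the kernel-buffer rewrite path is the hard part, since HMAC state would have to be tracked across the same no-write observation machinery described above.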
This is a good start; it does cover gaps in certain areas. There are a few more areas I can think of:
1. The endpoint matters. For example, if the credential is an OAuth2 token and the service has a token-refresh endpoint, then the response carries a new token in the payload, which reaches the agent directly.
2. Not all endpoints are made the same, even on the service side; some may not require a credential at all, and the proxy may end up leaking the credential to such endpoints.
3. The proxy is essentially doing a MITM at this point; it has just expanded its scope to do certificate validation as well, and doing that correctly is a hard problem.
4. All the credentials are stored on one machine, which now requires a much richer access and authorization framework around who can reach that machine. One might think they closed a security gap, only to realize they opened a couple more in the attempt.
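Point 1 can be sketched concretely: a proxy that only rewrites requests passes token-endpoint responses through untouched, so a fresh token lands in the agent's hands. A hypothetical response-side scrubber (all endpoint paths and field names here are made up for illustration) would have to do something like:

```python
import json

# Hypothetical set of endpoints known to mint fresh credentials.
TOKEN_ENDPOINTS = {"/oauth/token", "/oauth2/v2.0/token"}


def scrub_token_response(path: str, body: bytes) -> bytes:
    """Redact newly issued tokens before the response reaches the agent.

    Responses from non-token endpoints are returned unchanged; token-endpoint
    responses get their secret fields replaced with an opaque stub (the real
    token would be captured into the vault, which is omitted here).
    """
    if path not in TOKEN_ENDPOINTS:
        return body
    payload = json.loads(body)
    for field in ("access_token", "refresh_token"):
        if field in payload:
            payload[field] = "av-placeholder"
    return json.dumps(payload).encode()
```

Without something like this on the response path, the request-side brokering in points 1 and 2 is bypassed the first time the agent refreshes its own token.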
Thanks for this feedback! Will keep in mind all of these points as we iterate on Agent Vault.
We're pretty swamped with requests at the moment, but I've noted these down as improvements to AV; it's a work in progress, and we'll be molding it into the right shape over the next few months.
A few thoughts for each of the above:
1. AV doesn't consider OAuth2 tokens at the moment, but this is definitely a next step.
2. Agreed, which is why there is a "passthrough" mode: for each endpoint, you need to explicitly specify which credential is used for it.
3. That's correct. This is a MITM architecture with credential brokering capabilities added on top.
4. Agreed. The idea here is that AV can function as both a proxy and a vault, but in a true production setting it should pull credentials from a secure secrets store like Infisical. This way, credentials cached in memory in AV can even be made ephemeral.
Great observations all around and we have plans for them :)
I read the original article, then the detailed statement, and then this article to better understand what happened. I'd consider myself someone with a fairly good understanding of security flows. Here is my take:
1. The security flows are half-baked and custom-implemented; they do not present a coherent story.
2. No one fully understands the ecosystem as a whole, and so far no one has been able to track what actually happened; adding audit logs was never part of the product ask, so no one ever added them thoroughly.
If I had to put my money on one, it's the second. As for likely downstream action: at most, this incident will trigger more security engineers being hired, which may give the impression of improving things. In reality it will probably create more blind spots, as product engineers hand responsibility off to security engineers who don't have much insight into the product flows.
This is a crucial detail that almost everyone misses when skimming the topic at the surface. The implication is that this statement/law gets referenced more often to shut down architecture designs and discussions.
It's hard to make it useful. Maybe I am not the right audience for this type of content, or maybe I was trying to find something concrete in it related to my own experience.
What is being framed as obscurity is one of the approaches to security, as long as you are able to keep the code safe. Your passwords and security keys are just random combinations of strings; the fact that they are obscure to everyone else is what provides the security.
One decompilation and you are back to the level of security you started with. OpenSSH is open for a good reason. Please acknowledge your error. Are you an AI?
OTOH, their position seems to be that "many LLMs make bugs shallow" is unhelpful, in the same way "many eyes make all bugs shallow" is considered unhelpful.
What the open-source economy seems to genuinely need, to both surface these latent vulns and tamp down finding-slop, is a new https://bughook.github.com/your/repo/ endpoint that the big LLMs (Mythos, etc.) support. Mythos would recognize when it has been used to find a vuln, and the back end would auto-report verified findings that the git service can feed into a Dependabot-type tool.
Even better, price Mythos up to cover running a background verifier that fetches the project and revalidates the issue before firing that bughook.
Meanwhile, train it on these findings so its future self doesn't create them.
Every change introduces the possibility of a vulnerability being added to the system, so one would need to run the LLM scan across the entire code base on each change. That gets very costly in an environment where you are committing regularly. Companies like GitHub already provide static-analysis scanning tools, and the cost is already high for them.