Seeing the confusion in the comments, I want to provide some examples of situations where this might come up in a security or CTF context:
* You have a restricted shell or other way to execute a restricted set of commands or binaries, often with arbitrary parameters. You can use GTFOBins in interesting ways to read files, write files, or even execute commands and ultimately break out of your restricted context into a shell.
* Someone allowed sudo access or set the SUID bit on a GTFOBin. Using these tricks, you may be able to read or write sensitive files or execute privileged commands in a way the person configuring sudo did not know about (see the example below).
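For a concrete taste, here is essentially the GTFOBins entry for less (the file argument is arbitrary): a sudo rule for a pager is enough to get a root shell via its shell-escape feature.

    sudo less /etc/profile   # allowed by the sudo rule
    !/bin/sh                 # typed inside less: shell escape, now you have a root shell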
This is pretty relevant for things like claude-code, which has a fairly rudimentary permission model based on block-lists and allow-lists.
I once accidentally gave my claude "powershell" permissions in one session, and after that, any time it found itself blocked from using a tool, e.g. git, it would write a PowerShell script that did the same thing and execute the script to work around the blocked permission.
Obviously no sane system would have "powershell" in a generic allow-list, but you could imagine some discrepancy in allowed levels between tools which can be worked around with the techniques on this page.
PowerShell or Python scripts to work around restrictions are the go-to for LLMs.
And it doesn't stop there.
Yesterday I was trying to figure out an icon issue in KDE Plasma (I know nothing about KDE). Both Claude and Codex would run complex D-Bus and debug queries and write and execute QML scripts, with more and more tools thrown into the mix.
There's no way to properly block them with just allow- and block-lists.
> There's no way to properly block them with just allow- and block-lists.
Especially not when some harnesses rely on the LLM itself to reliably determine what's allowed or not: pretty much telling it "You shouldn't do thing X" and then asking the LLM to evaluate whether it should be able to do something when the situation comes up. Bananas.
The only right and productive way to run an agent on your computer is to isolate it properly somehow, then run it with "--sandbox danger-full-access --dangerously-bypass-approvals-and-sandbox" or whatever. I myself use docker containers, but there are lots of solutions out there.
You have to be extremely careful when you set up a dev container, lock down file access, do not give the agent the power to start other containers or "docker compose up", restrict network access to an allow-list etc. Just running the agent in a container does little to protect you. (Maybe you know this, but a lot of people don't!)
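To make that concrete, a minimal sketch of a locked-down invocation (the image and agent names are hypothetical, and the exact flags depend on what the agent needs):

    # No network (or attach a custom network with an egress allow-list),
    # read-only root fs with only the project dir writable, no extra
    # capabilities, and crucially: no docker socket mounted inside.
    docker run --rm -it \
      --network none \
      --read-only \
      --tmpfs /tmp \
      --cap-drop ALL \
      --pids-limit 256 \
      -v "$PWD":/workspace -w /workspace \
      my-agent-image codex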
Most of those things are what happens by default. Sure, be careful, but by default it's secure enough to prevent most potential issues. No need to lock down file access for example, by default it only has access to files inside the container, and of course by default containers don't have access to start other containers, and so on.
Good word of caution though, make sure you actually isolate when you set out to isolate something :)
As mentioned, "podman/docker run -it $my-image codex" also actually has the requisite isolation by default, no need for special software. Biggest risk is accidental deletion of stuff, easily solved without running an entire VM, which "smol" machines seems to be. No doubt VMs have their uses too, but for simple isolation like this I personally rather use already existing tooling.
Ok, YMMV, but a smolvm provides macOS-native, per-workload isolation -- vs a traditional container depending on a daemon and relying on namespaces (with a shared kernel). Easy "packing" into single-file executables, and a nice SDK, make it ~ideal for my needs; a great balance of security:convenience.
Cool ad bro, but stop claiming containers won't get you "per-workload isolation" just because they share kernels; in the context of this discussion it hardly matters, containers isolate enough for this.
I imagine someone probably wrote very specifically about it in the training data that underwent lossy compression, and the LLM is decompressing that how-to.
So I'd say it's more like "surfacing" or "retrieving" than "re-discovering".
They scraped everything on Stackoverflow, likely IRC logs from Freenode, and every book written in the modern era courtesy of Sci-Hub / Library Genesis / Anna's Archive / Z Library.
RIP Aaron Swartz, they're generating trillions in shareholder value from the spiritual successors to the work they were going to imprison you for.
For the LLM it's a probabilistic set of strings that achieves the outcome: the highest-probability set didn't work, so try the next one until success or a threshold is met. A human sees the obvious thing not working and reads the implicit message that someone doesn't want you to do it, but an LLM, unless guided, doesn't see that subtext.
So chmod +x file didn't work, now try python -c "import os; os.chmod('file', 0o744)"
Humans and LLMs both only see that when given the right context. A tool not working in a corporate environment may be anything from an oversight or a malfunction all the way to a deliberate security block. Knowing which one it is takes a lot of implicit knowledge. Most people fail to provide this level of context to their LLMs and then wonder why they act so generic. But they are trained to act in the most generic way unless given context that would deviate from it.
> * Someone allowed sudo access or set the SUID bit on a GTFOBin. Using these tricks, you may be able to read or write sensitive files or execute privileged commands in a way the person configuring sudo did not know about.
Some enterprise security software that is designed to "mediate privilege elevation" includes an allowlist configured by the administrators. My experience seeing this rolled out at one company was that software on the allowlist no longer required a password to run with `sudo`. The allowlist, of course, initially included all kinds of broadly useful software (e.g., vim, bash).
I worked from home at this company, and I remember thinking that was a good thing, because this software deployed to "secure" my computer made it drastically weaker against someone walking up to it and trying to run something if I stepped away from the keyboard for a moment and forgot to lock it.
A few years back, our support team needed to do some network capture with tcpdump. The quick and natural way to allow that was to add a sudo rule for it with unrestricted arguments (I know it's a bit risky, but the TCP port and NIC could change).
Looks good enough? Well no...
With tcpdump, you can specify a compress command with the "-z" option. But nothing prevents you from passing a "special" compress command and completely taking over the server:
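One well-known shape of the attack, essentially the GTFOBins entry for tcpdump: the -z command is executed on every rotated capture file, so you just make it rotate immediately.

    TF=$(mktemp)
    echo 'id > /tmp/pwned' > $TF   # any payload; it will run as root
    chmod +x $TF
    sudo tcpdump -ln -i lo -w /dev/null -W 1 -G 1 -z $TF -Z root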
This seems trivial, but that's the kind of stuff that's really easy to miss. Even if, these days, security layers like AppArmor mitigate this risk (causing a few headaches along the way), it's still relatively easy to mess up.
Specifically for these kinds of situations, sudo has the NOEXEC tag: it preloads a dummy library that null-routes all exec calls to prevent this kind of shell leak.
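In sudoers that looks something like this (the user name and binary path are made up for the example; check where your distro installs tcpdump):

    # tcpdump itself runs fine, but anything it tries to exec
    # (like a -z compress command) is blocked by the preload
    support ALL = (root) NOPASSWD: NOEXEC: /usr/bin/tcpdump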
Lots of people here are (perhaps rightfully) pointing to the unwrap() call being an issue. That might be true, but to me the fact that a reasonably "clean" panic at a defined line of code was not quickly picked up in any error monitoring system sounds just as important to investigate.
Assuming something similar to Sentry were in use, it should clearly pick up the many process crashes that start occurring right as the downtime starts. And the well-defined clean crashes should in theory also stand out against all the random errors that start occurring all over the system as it begins to go down, precisely because it's always failing at the exact same point.
In the early 2000s, when Google explained how they achieved their (already back then) awesome reliability, i.e. assuming that any software and hardware will eventually fail and designing everything around the idea that everything is faulty, there were some people who couldn't get it, who would still bring up the argument "yeah but today with modern RAID...".
People here chatting about unwrap remind me of them :)
Assuming software and people will fail is exactly what not using unwrap is about.
If you depend on engineers not fucking up, you will fail. Using unwrap is assuming humans won’t get human-enforced invariants wrong. They will. They did here.
As someone who works in formal verification of crypto systems, watching people like yourself advocate for a hope-and-prayer development methodology is astonishing.
However, I understand why we’re still having this debate. It’s the same debate that’s been occurring for the same reasons for decades.
Doing things correctly is mentally more difficult, and so people jump through ridiculous rhetorical hoops to justify why they will not — or quite often, mentally cannot — perform that intellectual labor.
It’s a disheartening lack of craftsmanship and industry accountability, but it’s nothing new.
I do not understand what gave you the impression that I was advocating for "hope and prayers". I'm advocating for not relying on one level of abstraction being flawless so we can build a perfect logic on top of it. I'm advocating for not handling everything in a single layer. That FL2 program at Cloudflare encountered an error condition and it bailed out, and that's fine. What is not fine is that the supervisor did not fail open.
The opposing views here are not "hope and prayers" vs "good engineering"; it's assuming things will fail at every stage vs assuming one can build a flawless layer of abstraction on top of which we can build.
Resilient systems trump "correct" systems, and I would pick a system designed under the assumption that fake errors will be injected regularly, that processes will be killed at random, that entire racks of machines will be unplugged at random at any time, that whole datacenters will be taken off grid for fun, over a system that's been "proven correct", any day. I thought it was common knowledge.
Of course I'm not arguing against proving that software is correct. I would actually argue that some formal methods would come in handy to model these kinds of systemic failures and reveal the worst cases with the largest blast radius.
But considering the case at hand, the code for that FL2 bot had an assertion regarding the size of received data, and that was a valid assertion; the process decided to panic, and that was the right decision. What was not right was the lack of instrumentation that should have made these failures obvious, and the fact that user queries failed when that non-essential bot failed, instead of bypassing that bot.
I work as a pentester. CSRF is not a problem of the user proving their identity, but instead a problem of the browser as a confused deputy. CSRF makes it so the browser proves the identity of the user to the application server without the user's consent.
You do need a rigid authentication and authorization scheme just as you described. However, this is completely orthogonal to CSRF issues. Some authentication schemes (such as bearer tokens in the authorization header) are not susceptible to CSRF, some are (such as cookies). The reason for that is just how they are implemented in browsers.
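To illustrate the confused deputy with cookie-based auth (host names here are made up): a hidden form auto-submitted from a page on evil.example makes the victim's browser send something like the following, attaching the session cookie all by itself:

    POST /transfer HTTP/1.1
    Host: bank.example
    Origin: https://evil.example
    Cookie: session=abc123
    Content-Type: application/x-www-form-urlencoded

    to=attacker&amount=1000

With a bearer token in an Authorization header, the attacker's page has no way to make the browser attach the credential, which is why that scheme isn't susceptible.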
I don't mean to be rude, but I urge you to follow the recommendation of the other commenters and read up on what CSRF is and why it is not the same issue as authentication in general.
Clearly knowledgeable people not knowing about the intricacies of (web) security is actually an issue that comes up a lot in my pentesting when I try to explain issues to customers or their developers. While they often know a lot about programming or technology, they frequently don't know enough about (web) security to conceptualize the attack vector, even after we explain it. Web security is a little special because of lots of little details in browser behavior. You truly need to engage your suspension of disbelief sometimes and just accept how things are to navigate that space. And on top of that, things tend to change a lot over the years.
Of course CSRF is a form of authorisation; "should I trust this request? is the client authorised to make this request? i.e. can the client prove that it should be trusted for this request?", it may not be "logging in" in the classic sense of "this user needs to be logged into our user system before i'll accept a form submit request", but it is still a "can i trust this request in order to process it?" model. You can wrap it up in whatever names and/or mechanism you want, it's still a trust issue (web or not, form or not, cookie or not, hidden field or not, header or not).
Servers should not blindly trust clients (and that includes headers passed by a browser claiming they came from such and such a server / page / etc); clients must prove they are trustworthy. And if you're smart, your system should be set up such that attacking it costs more than compliance.
And yes, I have worked both red team and blue team.
You say you should "never trust the client". Well, trust has to be established somehow, right? Otherwise you simply cannot allow any actions at all (airgap).
Then, CSRF protection prevents a class of attacks directed against a client you actually have decided to trust, attacks that fool that client into doing bad stuff.
All the things you say about auth: Already done, already checked. CSRF is the next step, protecting against clients you have decided to trust.
You could say that someone makes a CSRF attack that manages to change these headers of an unwitting client, but at that point absolutely all bets are off; you can invent hypothetical attacks against all current CSRF protection mechanisms too, which are all based on data the client sends.
(If HN comments cannot convince you why you are wrong I encourage you to take the thread to ChatGPT or similar as a neutral judge of sorts and ask it why you may be wrong here.)
Yes, this is documenting one particular way of doing CSRF. A specific implementation.
The OP is documenting another implementation of CSRF protection, which is unsuitable for many since it fails to protect 5% of browsers, but it's still an interesting look at the road ahead for CSRF, and in some years perhaps everyone will change how this is done.
And you say it isn't OK, but in my opinion you have not properly argued why not.
It doesn't actually fail to protect 5%, as the top-line 5% aren't really "browsers". Even things like checkboxes often top out at around 95%!
You can change a setting on caniuse.com to exclude untracked browsers. Sec-Fetch-Site then goes up to 97.6%, with the remainder being a bit of Safari (which will likely update soon) and some people still on ancient versions of Chrome.
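And the check itself is tiny. A sketch in nginx config (policy details are up to you: this one also lets through requests where the header is absent, which covers those older browsers, and you'd typically only enforce it for state-changing methods):

    # "none" means direct navigation (address bar, bookmark);
    # an empty value means the browser didn't send the header
    if ($http_sec_fetch_site !~ "^(same-origin|none|)$") {
        return 403;
    }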
It's very complicated and ever evolving. It takes dedicated web app pentesters like you to keep up with it... back in the day, we were all 'generalists'... we knew a little bit about everything, but those days are gone. It's too much and too complicated now to do that.
SameSite=Lax (the default Chrome applies to cookies that don't specify a SameSite attribute) will protect you against POST-based CSRF.
SameSite=Strict will also protect against GET-based CSRF (which shouldn't really exist, since GET is supposed to be a safe method and should not be allowed to trigger state changes, but in practice some applications do it). It does, however, also make it so users clicking a link to your page might not be logged in once they arrive, unless you implement other measures.
In practice, SameSite=Lax is appropriate and just works for most sites. A notable exception are POST-based SAML SSO flows, which might require a SameSite=None cookie just for the login flow.
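As Set-Cookie headers, that ends up looking something like this (names and values are placeholders; note that SameSite=None additionally requires Secure):

    Set-Cookie: session=<opaque-id>; Path=/; Secure; HttpOnly; SameSite=Lax

and, only for the cross-site SAML POST flow:

    Set-Cookie: saml_sso=<opaque-id>; Path=/; Secure; HttpOnly; SameSite=None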
Yes, you're definitely right that there are edge cases and I was simplifying a bit. Notably, it's called SameSite, NOT SameOrigin. Depending on your application that might matter a lot.
In practice, SameSite=Lax is already very effective in preventing _most_ CSRF attacks. However, I 100% agree with you that adding a second defense mechanism (such as the Sec header, a custom "Protect-Me-From-Csrf: true" header, or if you have a really sensitive use case, cryptographically secure CSRF tokens) is a very good idea.
The article doesn't mention possible security implications. However, we already get lots of vulnerabilities exactly _because_ implementations disagree on delimiters. Examples for this are HTTP request smuggling[1, 2, 3] and SMTP smuggling[4].
As the references show, this is already a big source of vulnerabilities - trying to push for a change in standards would likely make the situation much worse. At the very least, old unmaintained servers will not change their behavior.
I think we should accept that this ship has sailed and leave existing protocols alone. Mandate LF and disallow CRLF in new protocols, that's fine, but I don't think we should open this particular Pandora's Box.
Or set `push.default` to `current` to have a plain `git push origin` push the current branch to a branch of the same name, ignoring the configured upstream (you might also want to set `remote.pushDefault` alongside that, so a bare `git push` picks the right remote).
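i.e. (these are real config keys; `origin` is just the usual remote name):

    git config --global push.default current      # push HEAD to a same-named branch
    git config --global remote.pushDefault origin # the remote a bare "git push" uses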
Are there any resources you can recommend to understand D-Wave's quantum computing a bit better?
I took a very basic course about gate-model quantum computing at my university. The (mathematics) professor would have loved to be able to explain adiabatic quantum computing on a basic level, but was unable to find entry-level material to really understand how it works or what problems it can solve.