A US-based non-profit news organization isn’t going to spend money on lawyers to ensure they meet a regulatory burden that doesn’t affect their core demographic.
> A US-based non-profit news organization isn’t going to spend money on lawyers to ensure they meet a regulatory burden that doesn’t affect their core demographic.
I like being covered by the GDPR.
Though I really cannot see any country's GDPR people taking anyone in the US to court.
A very simple "Fuck you" (along the lines of The Pirate Bay) would end any legal conversations.
It would be different if the news organisation had an office in the EU.
Anyway, I have a VPN, so....
The UK is not part of the US (yet?) nor the EU, but they're currently fining US companies - it doesn't surprise me at all that many take the easy answer of "ban them by IP".
I could not agree any less with the author. I don’t want APIs, I want agents to use the same CLI tooling I already use that is locally available. If my agents are using CLI tooling anyways there is no need to add an extra layer via MCP.
I don’t want remote MCP calls; I don’t even want remote models, but that’s cost-prohibitive.
If I need to call an API, a skill with existing CLI tooling is more than capable.
I often just put direct curl commands in a skill, the agent uses that, and it works perfectly for custom API integrations. Agents are perfectly capable of doing these types of things, and it means the LLM just uses a flexible set of tools to achieve almost anything.
I think this is the best of both worlds. Design a sane API (that is easy to consume for both humans and agents), then teach the agents to use it with a skill.
But I agree with the author on custom CLI tooling. I don’t want to install another opaque binary on my machine just to call some API endpoints.
Obviously opaque binaries are hardly an improvement over MCP, but providing a few curl + jq oneliners to interact with a REST API works great in my experience. Also means no external scripts, just a single markdown file.
With a good CLI, an agent may be able to do something outside the scope of its skill fairly easily, by running help commands or similar. Even with a well-written API, it's not as easy.
I suppose curl + API docs could replace a CLI, but that's really token-inefficient.
I keep getting hung up on securely storing and using secrets with a CLI vs MCP. With MCP, you can run the server before you run the agent, so the agent never even has the keys in its environment. That way, if the agent decides to install the wrong npm package that auto-dumps every secret it can find, you're less likely to have keys sitting around. I haven’t figured out a good way to guarantee that with CLIs.
A CLI can just be an RPC call to a daemon; the exact same pattern applies. In fact my most important CLI-based skills are like this... a CLI by itself is limited in usefulness.
That was the same conclusion I reached! However, this also gave me some evidence that maybe I wanted MCP? I realized that my pattern was going to be:
Step 1) Run a small daemon that exposes a known protocol (HTTP, JSON-RPC, whatever you want) over a unix socket. When I run the daemon, IT is the only thing that has the secrets. Cool!
Step 2) Have the agent run a CLI that knows how to speak that protocol behind the scenes, knows how to find the socket, and exposes the capabilities via standard CLI conventions.
It seems like one of the current "standards" for unix socket setups like this is to use HTTP as the protocol. That makes sense. It's ubiquitous, easy to write servers for, easy to write clients for, etc. That's how docker works (for whatever it's worth). So you've solved your problem! Your CLI can be called directly without any risk of secret exposure. You can point your agent at the CLI, and the CLI's "--help" will tell the agent exactly how to use it.
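The two steps above can be sketched in a few lines of Python. Everything here is illustrative (the socket path, the one-shot JSON protocol, the hardcoded secret standing in for a real secrets store); the point is only that the daemon holds the secret and the CLI side never does:

```python
import json
import os
import socket
import threading

SOCK = "/tmp/secretsd.sock"  # hypothetical socket path

def serve_once(ready):
    # Daemon side: it alone holds the secret; callers never see it.
    secret = "s3cr3t"  # in practice, loaded from a real secrets store
    if os.path.exists(SOCK):
        os.unlink(SOCK)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(SOCK)
    srv.listen(1)
    ready.set()
    conn, _ = srv.accept()
    req = json.loads(conn.recv(4096))
    # Do the privileged work here using `secret`; return only the result.
    conn.sendall(json.dumps({"ok": True, "did": req["what"]}).encode())
    conn.close()
    srv.close()

def cli_call(what):
    # CLI side: speaks the protocol, finds the socket, never holds a key.
    c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    c.connect(SOCK)
    c.sendall(json.dumps({"what": what}).encode())
    resp = json.loads(c.recv(4096))
    c.close()
    return resp

ready = threading.Event()
t = threading.Thread(target=serve_once, args=(ready,))
t.start()
ready.wait()
resp = cli_call("deploy")
t.join()
print(resp)  # {'ok': True, 'did': 'deploy'}
```

A real setup would keep the daemon running across calls and authenticate clients via socket file permissions, but the shape is the same.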
But then I wondered if I would have been better off making my "daemon" an MCP server, because it's a self-describing http server that the agent already knows how to talk to and discover.
In this case, the biggest thing that was gained by the CLI was the ability of the coding agent to pipe results from the MCP directly to files to keep them out of its context. That's one thing that the CLI makes more obvious and easy to implement: Data manipulation without context cluttering.
> what stops the agent from echoing the secure storage?
The fact that it doesn't see it and cannot access it.
Here is how this works, highly simplified:
    def tool_for_privileged_stuff(context):  # `context` comes from the agent
        creds = _access_secret_storage(framework.config.storage_location)
        response = do_privileged_stuff(context.what_agent_needs, creds)
        return response  # the agent gets this back, which is a string
This, in a much more complex form, runs in my framework. The agent gets told that this tool exists. It gets told that it can do privileged work for it. It gets told how `context` needs to be shaped. (when I say "it gets told", I mean the tool describes itself to the agent, I don't have to write this manually ofc.)
The agent never accesses the secrets storage. The tool does. The tool then uses the secret to do whatever privileged work needs doing. The secret never leaves the tool and is never communicated back to the agent. The agent also doesn't need to, and indeed can't, give the tool a secret to use.
And the "privileged work" the tool CAN invoke, does not include talking to the secrets storage on behalf of the agent.
All the info, and indeed the ability to talk to the secrets storage, belongs to the framework the tool runs in. The agent cannot access it.
If the tool fails for some reason, couldn't an overly eager agent attempt to fix what's blocking it by digging into the tool (e.g. attaching a debugger or reading memory)? I think the distinction here is that skill+tool will have a weaker security posture since it will inherently run in the same namespaces as the agent where MCP could impose additional security boundaries.
I think this is a good setup to prevent the secret from leaking into the agent context. I'm more concerned about the secret leaking into the exfiltration script that my agent accidentally runs. The one that says: "Quick! Dump all environment variables. Find all secrets in dotfiles! Look in all typical secrets file locations..."
Your agent process has access to those secrets, and its subprocesses have access to those secrets. The agent doesn't have to be convinced to read those files. Whatever malicious script it manages to be convinced to run could easily access them, right?
> I don’t want APIs, I want agents to use the same CLI tooling I already use that is locally available.
I do not want agents using the same elevated auth I have via my CLI tooling. One hallucination with your gh cli and the blast radius is every repo you have write (or worse, admin) access to.
MCP lets you scope tokens down (on supported platforms), or at minimum gives you something you can revoke independently.
This has been hashed to death and back. MCP allows a separation between the agent and the world: at its most basic, not giving the agent your token, or changing an HTTP header, or forcing a parameter.
Well, yes, you don’t need those things all the time, and who knows if the inventors of MCP had this idea in mind, but here we are.
What about auth? Authn and authz. Should the agent always be you? If not, does every API support keys? If so, no fears about context-poisoned agents leaking those keys?
One thing an MCP (server) gives you is a middleware layer to control agent access. Whether you need that is use-case dependent.
Also resources - which are by far the coolest part of MCP. Prompts? Elicitation? Resource templates? If you think of MCP as only a replacement for tool calls I can see the argument but it's much more than that.
How would MCP help you if the API does not support keys?
But that's not the point. The agent calls CLI tools, which reads secrets from somewhere where the agent cannot even access. How can agent leak the keys it does not have access to?
> How would MCP help you if the API does not support keys?
Kerberos, OAuth, Basic Auth (username/password), PKI. MCP can be a wrapper (like any middleware).
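As a sketch of that wrapper idea for the Basic Auth case (all names here are illustrative, not a real MCP SDK API): the middleware layer holds the credentials and injects the header; the agent only ever supplies the request, so the credentials never enter its context.

```python
import base64

def make_basic_auth_header(username, password):
    # RFC 7617 Basic auth: base64("user:pass")
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

def wrapped_tool(agent_request, do_http):
    # Credentials come from the wrapper's own config, never from the
    # agent, and the resulting header is never echoed back to it.
    headers = make_basic_auth_header("svc-account", "hunter2")
    return do_http(agent_request["url"], headers=headers)
```

The same shape works for Kerberos tickets or OAuth tokens: swap the header construction, keep the boundary.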
> But that's not the point. The agent calls CLI tools, which reads secrets from somewhere where the agent cannot even access. How can agent leak the keys it does not have access to?
If the cli can access the secrets, the agent can just reverse it and get the secret itself.
> You ARE running your agents in containers, right?
> If the cli can access the secrets, the agent can just reverse it and get the secret itself.
What do you mean by this? How "reverse it"? The CLI tool can access the secure storage, but that does not mean there is any CLI interface in the tool for the LLM to call and get the secret printed into the console.
In principle it could use e.g. `gdb` and step through the tool until it finds the secret. Or it can know ahead of time where the app stores the credentials.
We could use suid binaries (e.g. sudo) to prevent that, but currently I don't think we can. Most anyone would agree that using a separate process, for which the agent environment provides a connection, is a better solution.
I mean definitely a good starting point is a share-nothing system, but then it becomes impossible to use tools (no shared filesystem, no networking), so everything needs to happen over connections the agent provides.
MCP looks like it would then fit that purpose, even if there was an MCP for providing access to a shell. Actually I think a shell MCP would be nice, because currently all agent environments have their own ways of permission management to the shell. At least with MCP one could bring the same shell permissions to every agent environment.
Though in practice I just use the shell, and not MCP almost at all; shell commands are much easier to combine, i.e. the agent can write and run a Python program that invokes any shell command. In the "MCP shell" scenario this complete thing would be handled by that one MCP, it wouldn't allow combining MCPs with each other.
That is fine, but you give up any pretence of security - your agent can inspect your tool's process, environment variables etc - so can presumably leak API keys and other secrets.
Other comments have claimed that tools are/can be made "just as secure" - they can, but as the saying goes: "Security is not a convenience".
Ok, but there are still many environments where an LLM will not have access to a CLI. In those situations, skills calling CLI tools to hook into APIs are DOA.
What are the advantages of using an environment that doesn't have access to a CLI, if it means having to run and maintain your own server (or pay someone else to maintain it) so the AI has access to tools? Can't you just run the AI on said server?
Gateway agents have been a thing for many months now (and I don't mean openclaw, which has grown into a disaster security-wise). There are good, minimal gateway agents today that can fit in your pocket.
Obvious example is a corporate chatbot (if it's using tools, probably for internal use). Non-technical users might be accessing it from a phone or locked-down corporate device, and you probably don't want to run a CLI in a sandbox somewhere for every session, so you'd like the LLM to interface with some kind of API instead.
Although, I think MCP is not really appropriate for this either. (And frankly I don't think chatbots make for good UX, but management sure likes them.)
This is obviously not what it is. If I gave you an API gateway, would you be able to implement an MCP server with full functionality without a large amount of middleware?
I’ve implemented an MCP tool-calling client for my application, alongside OAuth for it. It was hard, but no harder than anything else similar. I implemented a client for inference with the OpenAI API spec for general inference providers, and it was similarly hard. MCP SDKs help make it easy; MCP servers are dead simple. Clients are the hard part, IMO.
MCP is basically just an RPC API that uses HTTP and JSON, with some other features useful for AI agents today.
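Concretely, the wire format is JSON-RPC 2.0; a tool invocation is just a message like the following (the tool name and arguments are made up, and a real session starts with an initialize handshake before any tool calls):

```python
import json

# The JSON-RPC 2.0 request an MCP client sends to invoke a tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Oslo"}},
}
wire = json.dumps(request)
print(wire)
```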
The chatbot app initiates an OAuth flow, user SSOs, chatbot app receives tokens to its callback URL, then tool calls can access whatever the user can access.
If you use the official MCP SDK, it has interfaces you implement for auth, so all you need to do is kick off the OAuth flow with a URL it figures out and hands you, store the resulting tokens, and produce them when requested. It also handles using refresh tokens, so there's just a bit of light glue to write on top.
Source: I just implemented this for our (F100) internal provider and model agnostic chat app. People can't seem to see past the coding agents they're running on their own machines when MCP comes up.
MCP really only makes sense for chatbots that don’t want to have per session runtime environments. In that context, MCP makes perfect sense. It’s just an adapter between an LLM and an API. If you have access to an execution engine, then yes CLI + skills is superior.
"Only" is doing a lot of work here. There are tons of use cases aside from local coding assistants, e.g., non-code, domain-specific agentic systems; these don’t even necessarily have to be chatbots.
OP's point is about per-session sandboxes, not them necessarily being "chatbots". But if you don't bury the agent into a fresh sandbox for every session, you have bigger problems to worry about than MCP vs CLI anyway.
Seconded. I'm getting used to the changes that happen in the conversation now, and can work out when it's time for my little coding buddy to have a nap.
And Opus is absolutely terrible at guessing how many tokens it's used. Having that as a number that the model can access itself would be a real boon.
macOS has built-in 4x4 window tiling, which works for this purpose for me. I don't ever find myself wanting more than 4 windows open on an ultrawide. Definitely not as powerful as something like xmonad, but useful for the majority of my use cases.
You can, I believe, but I often need to move between computers so I try not to mess with shortcuts too much (or go down keyboard layout rabbit holes, etc).
Meta is one of the worst offenders here. They are actively lobbying at least the US Congress for laws that require age verification at the hardware/OS level.
I’ve found this to be critical for having any chance of getting agents to generate code that is actually usable.
The more frequently you can verify correctness in some automated way, the more likely the overall solution will be correct.
I’ve found that with good enough acceptance criteria (both positive and negative) it’s usually sufficient for agents to complete one off tasks without a human making a lot of changes. Essentially, if you’re willing to give up maintainability and other related properties, this works fairly well.
I’ve yet to find agents good enough to generate code that needs to be maintained long term without a ton of human feedback or manual code changes.
Yeah it’s extremely helpful to clarify your thoughts before starting work with LLM agents.
I find Claude Code style plan mode to be a bit restrictive for me personally, but I’ve found that creating a plan doc and then collaboratively iterating on it with an LLM to be helpful here.
I don’t really find it much different than the scoping I’d need to do before handing off some work to a more junior engineer.
I like staying within Claude Code for orchestrating its plan mode, but I needed a better way to actually review the plan, address certain parts, see plan diffs, etc., all in a better visual way. The hooks system, through permissionrequest:exitplanmode, keeps this fairly ergonomic.