Ok, but there are still many environments where an LLM will not have access to a CLI. In those situations, skills calling CLI tools to hook into APIs are DOA.
What are the advantages of using an environment that doesn't have access to a CLI, only having to run/maintain your own server, or pay someone else to maintain that server, so AI has access to tools? Can't you just use AI in the said server?
gateway agent is a thing for many months now (and I don't mean openclaw, that's grown into a disaster security wise). There are good, minimal gateway agents today that can fit in your pocket.
Obvious example is a corporate chatbot (if it's using tools, probably for internal use). Non-technical users might be accessing it from a phone or locked-down corporate device, and you probably don't want to run a CLI in a sandbox somewhere for every session, so you'd like the LLM to interface with some kind of API instead.
Although, I think MCP is not really appropriate for this either. (And frankly I don't think chatbots make for good UX, but management sure likes them.)
This is obviously not what it is. If I give you APIGW would you be able to implement an MCP server with full functionality without a large amount of middleware?
I’ve implemented an MCP tool calling client for my application, alongside OAuth for it. It was hard but no harder than anything else similar. I implemented a client for interference with the OpenAI API spec for general inference providers, and it was similarly as hard. MCP. SDKs help make it easy; MCP servers are dead simple. Clients are the hard part, IMO.
MCP is basically just an RPC API that uses HTTP and JSON, with some other features useful for AI agents today.
The chatbot app initiates an OAuth flow, user SSOs, chatbot app receives tokens to its callback URL, then tool calls can access whatever the user can access.
If you use the official MCP SDK, it has interfaces you implement for auth, so all you need to do is kick off the OAuth flow with a URL it figures out and hands you, storing the resulting tokens and producing them when requested. It also handles using refresh tokens, so there's just a bit of light friendly owl finishing on top.
Source: I just implemented this for our (F100) internal provider and model agnostic chat app. People can't seem to see past the coding agents they're running on their own machines when MCP comes up.
MCP really only makes sense for chatbots that don’t want to have per session runtime environments. In that context, MCP makes perfect sense. It’s just an adapter between an LLM and an API. If you have access to an execution engine, then yes CLI + skills is superior.
Only is doing a lot of work here. There are tons of use cases aside from local coding assistants, e.g., non-code related domain specific agentic systems; these don’t even necessarily have to be chatbots.
OP's point is about per session sandboxes, not them necessarily being "chatbots". But if you don't burry the agent into a fresh sandbox for every session you have bigger problems to worry about than MCP vs CLI anyway