Yes…mistakes are inevitable, and I get not expecting or demanding perfection. But the subtext here is that this is unlikely to be a mistake, and much more likely to be fraud.
There are incentives for these spreadsheets to have the values they do; there is no conceivable way the values are correct; and on top of that, the most likely way to get these values is to copy and paste large blocks of numbers and even perturb some of them manually.
If you see this in accounting (where there are also mistakes), it’s definitely fraud. (Awww man - we accidentally inflated our revenue and profit to meet expectations by accidentally duplicating numerous revenue lines and no one internally caught it! Dang interns!)
If you see it in science, you ask the authors about it and they shrug and mumble a semi-plausible explanation if you’re lucky? I can totally imagine a lab tech or grad student making a large copy-paste mistake. I can’t imagine them making a series of them in such a way that it bolsters or proves the author’s claim AND goes completely undetected by everyone involved.
I'm unclear what's being asked. Zellij is just a TUI-based terminal multiplexer like tmux and screen: you either run it locally and SSH from within it to a remote machine, or SSH to a remote machine and run Zellij inside the remote session.
This should be the top comment. The OP misunderstands the change and has their LLM write an exposé. The company responds with a well-reasoned explanation that it would actually cost MORE money if there were a global 1h default for ALL prompts. It gets downvoted and the pitchforks stay out because…I presume phrases like “cache read likelihood” sound like made-up fluff to the audience, rather than an actual explanation?
It only potentially saves money for people on API pricing; it exhausts tokens faster with no benefit for users on the Claude Code subscription. Those users had their cache TTL reduced from 1 hour to 5 minutes and are saving no money, because they were not paying based on cache time in the first place.
Because it is made up fluff for this audience. There is a wall of data and evidence + anecdotes from many people pointing to the exact problem here and giving concrete examples of how this absolutely does cost more.
And an admittedly uncharitable TLDR on the response is: "yeah... but most users just ask one thing and barely use the product so they never need the cache. Also trust me bro".
Which, sure, fine, I'm willing to bet is technically true. I'd also bet those users never previously came close to hitting their session limits, precisely because their usage is so low. But now people who were previously considered low-to-moderate users are hitting limits within minutes.
They may as well have just said "we've looked at the data and we're happy with this change because it's a performance improvement for people we make the most margin on. Sucks to be you".
Are there any serious papers or theories that postulate that DNA is the self replicating matter sent to colonize the galaxy? It appears to be quite adaptable to its environment and able to hold a surprising amount of encoded information.
I am a big fan of Marimo and was trying to use it as my agent’s “REPL” a while back, because it’s naturally so good at describing its own current state and structure. It made me think it would make a better state-preserving environment for the agent to work in. I’m very excited to play with this.
I keep getting hung up on securely storing and using secrets with CLI vs MCP. With MCP, you can run the server before you run the agent, so the agent never even has the keys in its environment. That way, if the agent decides to install the wrong npm package that auto-dumps every secret it can find, you are less likely to have the keys sitting around. I haven’t figured out a good way to guarantee that with CLIs.
A CLI can just be an RPC call to a daemon; the exact same pattern applies. In fact, my most important CLI-based skills work like this. A CLI by itself is limited in usefulness.
That was the same conclusion I reached! However, this also gave me some evidence that maybe I wanted MCP? I realized that my pattern was going to be:
Step 1) Run a small daemon that exposes a known protocol (http, json-rpc, whatever you want) over a unix socket. When I run the daemon, IT is the only thing that has the secrets. Cool!
Step 2) Have the agent run a CLI that knows to speak that protocol behind the scenes, knows how to find the socket, and exposes the capabilities via standard CLI conventions.
It seems like one of the current "standards" for unix socket setups like this is to use HTTP as the protocol. That makes sense. It's ubiquitous, easy to write servers for, easy to write clients for, etc. That's how docker works (for whatever it's worth). So you've solved your problem! Your CLI can be called directly without any risk of secret exposure. You can point your agent at the CLI, and the CLI's "--help" will tell the agent exactly how to use it.
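The two steps above can be sketched end to end. This is a hypothetical, minimal illustration (socket path, handler, and payload are all made up, and Python stands in for whatever language the daemon would really be written in): a daemon holds the secret and serves HTTP over a unix socket, and the "CLI" side speaks plain HTTP to it without ever seeing the secret.

```python
# Sketch only: a secret-holding daemon over a unix socket, docker-style.
import http.server
import json
import os
import socket
import socketserver
import threading

SOCKET_PATH = "/tmp/demo-daemon.sock"  # hypothetical location
SECRET = "s3cr3t-api-key"              # only the daemon process holds this

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # The secret is used server-side; only the derived result is returned.
        body = json.dumps({"items": ["a", "b"], "authed": bool(SECRET)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging for the demo
        pass

class UnixHTTPServer(socketserver.UnixStreamServer):
    def get_request(self):
        # BaseHTTPRequestHandler expects a (host, port) client address,
        # which unix sockets don't have -- fake one.
        request, _ = super().get_request()
        return request, ("local", 0)

if os.path.exists(SOCKET_PATH):
    os.remove(SOCKET_PATH)
server = UnixHTTPServer(SOCKET_PATH, Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# --- the "CLI" side: plain HTTP over the socket; no secret in sight ---
client = socket.socket(socket.AF_UNIX)
client.connect(SOCKET_PATH)
client.sendall(b"GET /items HTTP/1.0\r\nHost: localhost\r\n\r\n")
raw = b""
while chunk := client.recv(4096):
    raw += chunk
client.close()
server.shutdown()
body = raw.decode().rsplit("\r\n\r\n", 1)[-1]
print(body)  # JSON result only; SECRET never crosses the socket
```

A real CLI would wrap that client half behind normal flags and `--help` text, which is all the agent ever needs to see.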
But then I wondered if I would have been better off making my "daemon" an MCP server, because it's a self-describing http server that the agent already knows how to talk to and discover.
In this case, the biggest thing that was gained by the CLI was the ability of the coding agent to pipe results from the MCP directly to files to keep them out of its context. That's one thing that the CLI makes more obvious and easy to implement: Data manipulation without context cluttering.
> what stops the agent from echoing the secure storage?
The fact that it doesn't see it and cannot access it.
Here is how this works, highly simplified:
    def tool_for_privileged_stuff(context):  # `context` comes from the agent
        # The tool, not the agent, reads the secret from storage.
        creds = _access_secret_storage(framework.config.storagelocation)
        response = do_privileged_stuff(context.whatagentneeds, creds)
        return response  # the agent will get this, which is a string
This, in a much more complex form, runs in my framework. The agent gets told that this tool exists. It gets told that it can do privileged work for it. It gets told how `context` needs to be shaped. (when I say "it gets told", I mean the tool describes itself to the agent, I don't have to write this manually ofc.)
The agent never accesses the secrets storage. The tool does. The tool then uses the secret to do whatever privileged work needs doing. The secret never leaves the tool, and is never communicated back to the agent. The agent also doesn't need to, and indeed cannot, give the tool a secret to use.
And the "privileged work" the tool CAN invoke, does not include talking to the secrets storage on behalf of the agent.
All the info, and indeed the ability to talk to the secrets storage, belongs to the framework the tool runs in. The agent cannot access it.
If the tool fails for some reason, couldn't an overly eager agent attempt to fix what's blocking it by digging into the tool (e.g. attaching a debugger or reading memory)? I think the distinction here is that skill+tool will have a weaker security posture, since it will inherently run in the same namespaces as the agent, whereas MCP could impose additional security boundaries.
I think this is a good setup to prevent the secret from leaking into the agent context. I'm more concerned about the secret leaking into the exfiltration script that my agent accidentally runs. The one that says: "Quick! Dump all environment variables. Find all secrets in dotfiles! Look in all typical secrets file locations..."
Your agent process has access to those secrets, and its subprocesses have access to those secrets. The agent doesn't have to be convinced to read those files. Whatever malicious script it manages to be convinced to run could easily access them, right?
Just wrap it with a script that handles the auth for you, and the AI doesn't realize auth is even needed. I put my creds in ~/.config/ and write bash wrappers that read them and proxy the values into the API calls as needed.
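The wrapper idea is simple enough to sketch. The comment describes bash wrappers; here is the same shape in Python for illustration, with a hypothetical config path and key name: credentials are loaded inside the wrapper and attached as an auth header, so the caller (the agent) never passes or sees a key.

```python
# Sketch only: a wrapper that injects auth so callers never handle the key.
import json
import os
import urllib.request

# Hypothetical path; the agent is never told to look here.
CONFIG_PATH = os.path.expanduser("~/.config/mytool/creds.json")

def build_request(endpoint: str, config_path: str = CONFIG_PATH) -> urllib.request.Request:
    """Load creds inside the wrapper and attach the auth header."""
    with open(config_path) as f:
        creds = json.load(f)
    req = urllib.request.Request(endpoint)
    # Auth is added here, inside the wrapper -- the caller passes no key.
    req.add_header("Authorization", f"Bearer {creds['api_key']}")
    return req
```

Note this only keeps the key out of the agent's prompts and arguments; as others point out above, a script the agent runs in the same environment can still read the config file directly.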