Hacker News | new | past | comments | ask | show | jobs | submit | dbmikus's comments | login

Are there any good guides on how to write prompt files tailored to different agents?

Would also be interested in examples of a CLAUDE.md file that works well in Claude, but works poorly with Codex.


I think one of the main examples, which I saw in a swyx article a while back, is that the sort of ALL CAPS and *IMPORTANT* language that works decently with Claude will actually detune the Codex models and make them perform worse. I'll see if I can find the post.

Every website needs to add the "friend or foe" system[0] so that I can mark bots to avoid their content and mark good posters so I can filter just to theirs.

[0]: https://hackersmacker.org/


This should be separate from marking bots, because what this will really do is embed people into hearing only what they want, making discussion worse.

No, I truly do not want to read IHeartHitler88's opinion on Jews, or donttreadonme09's bright opinions about how the economy would be better if we listened to Ayn Rand. I'll be very happy when they're out of my sight. If I want to have a miserable day, sure, I'll turn it off.

Fact of the matter is, most posts on the internet are already dogshit. Now they're also populated by AI, but the point stands. Most of what you will say online is at best useless.


>Most of what you will say online is at best useless.

If that is true, you are saying far too much.


I know, it hurts. Most of what I say on this website doesn't matter. Even if it did, it's about the same as screaming into the void. And that applies to you too.

The vast majority of what we post is vapid, useless bullshit.


On /. I would only mark obnoxious people as friends so I could see the friend-of-a-friend indicator and be cautious of anyone aligned with them.

From the article, it looks like they integrated with Docker because someone at Docker reached out about collaborating on the integration.

Regarding security, I think you need three things:

  1. You need the agent to run inside a sandbox.
  2. You need a safe perimeter or proxy that can apply deterministic filtering rules on what makes it into the AI agent's sandbox and the HTTP requests and responses that agent sends out from the sandbox.
  3. The bot should have its own email accounts, or be configured to only send to and read from certain email addresses.

I'm working on a product that makes spinning up remote agent sandboxes as easy as git push and git pull. Once that works well, we'll put a proxy around each sandbox so users can control the filtering rules.
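To make point 2 concrete, here's a minimal sketch of a deterministic egress filter. The hosts, methods, and function name are made up for illustration; this isn't any product's actual API, just the shape of the rule check a perimeter proxy would run on every outbound request:

```python
# Sketch of a deterministic egress filter for an agent sandbox:
# only requests to allow-listed hosts with allow-listed methods get out.
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.github.com", "pypi.org"}   # example allowlist
ALLOWED_METHODS = {"GET", "HEAD"}                # read-only by default

def egress_allowed(method: str, url: str) -> bool:
    """Return True iff the sandboxed agent may send this request."""
    host = urlsplit(url).hostname or ""
    return method.upper() in ALLOWED_METHODS and host in ALLOWED_HOSTS

assert egress_allowed("GET", "https://api.github.com/repos/x/y")
assert not egress_allowed("POST", "https://api.github.com/repos/x/y")
assert not egress_allowed("GET", "https://evil.example.com/exfil")
```

The point of keeping the rules deterministic is that the agent can't talk its way past them; the check runs outside the model entirely.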

I personally see a future where there are many different types of *Claws, coding agents, etc. and I think they need a new "operating system", so to speak.

Self-plug at the end: https://github.com/gofixpoint/amika. The OSS part of my startup, focused on sandbox coding agents right now :)

PS: I enjoyed the entropytown.com blog! bookmarking it


I like that it's all bash.

How does this compare with Codex's and Claude's built-in sandboxing?


Claude: can escape its sandbox (there are GitHub issues about this) and, when sandboxed, still has full read access to everything on your machine (SSH keys, API keys, files, etc.)

Codex: IIRC, only shell commands are sandboxed; the actual agent runtime is not.


Cool, thanks for explaining!


I've been working on an OSS project, Amika[1], to quickly spin up local or remote sandboxes for coding workloads. We support copy-on-write semantics locally (well, "copy-and-then-write" for now... we just copy directories to a temp file-tree).

It's tailored to play nicely with Git: spin up sandboxes from the CLI, expose TCP/UDP ports of apps to check your work, and, if running hosted sandboxes, share the sandbox URLs with teammates. I basically want running sandboxed agents to be as easy as `git clone ...`.

Docs are early and edges are rough. This week I'm starting to dogfood all my dev using Amika. Feedback is super appreciated!

FYI: we are also a startup, but local sandbox mgmt will stay OSS.

[1]: https://github.com/gofixpoint/amika


This is just a thin wrapper over Docker. It still doesn't offer what I want. I can't run macOS apps, and if I'm doing any sort of compilation, now I need a cross-compile toolchain (and need to target two platforms??).

Just use Docker, or a VM.

The other issue is that this does not facilitate unpredictable file access -- I have to mount everything up front. Sometimes you don't know what you need. And even then copying in and out is very different from a true overlay.


Appreciate the deets!

It sounds like a big part of your use case is to safely give an agent control of your computer? Like, for things besides codegen?

We're probably not going to directly support that type of use case, since we're focused on code-gen agents and migrating their work between localhost and the cloud.

We are going to add dynamic filesystem mounting, for after sandbox creation. Haven't figured out the exact implementation yet. Might be a FUSE layer we build ourselves. Mutagen is pretty interesting as well here.


> Or mainly to save the end user the hassle to set it up correctly

It's this.

Don't have a Dropbox moment ;) [1]

[1]: https://news.ycombinator.com/item?id=9224


Oh, lots of people will not be comfortable with the tmux approach. The Anthropic feature makes sense, but it's Max-only and, according to other comments, doesn't work well.

What I posted "just works".


I think the README could use a few real use-case examples. I understand it in an abstract sense, but not sure I understand the benefit vs plain-text communication, besides saving on token spend.


I'm going to be pedantic and note that iOS and Android both have the capability security model for their apps.

And totally agree that instead of reinventing the wheel here, we should just lift from how operating systems work, for two reasons:

1. there's a bunch of work and proven systems there already

2. it uses tools that exist in training data, instead of net new tools


App permissions in iOS and Android are both too coarse-grained to really be considered capabilities. Capabilities (at least as they exist in something like Eros or Capsicum) are more "You have access to this specific file" or "You can make connections to this specific server" rather than "You have access to files" and "You have access to the network". The file descriptor is passed in externally from a privileged process where the user explicitly decides what rights to give the process; there is no open() or connect() syscall available to user programs.
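The broker pattern described above can be sketched in a few lines. Everything here (the `broker_open` name, the allowlist) is invented for illustration; real systems like Capsicum do this at the syscall level, but the shape is the same: a privileged side turns a user grant into a file descriptor, and the sandboxed side only ever sees the descriptor, never a path or an open() right:

```python
# Capability-style file access: a privileged "broker" opens the specific
# file and hands the sandboxed code only a file descriptor.
import os
import tempfile

ALLOWED = set()  # paths the user has explicitly granted

def broker_open(path: str) -> int:
    """Privileged side: return an fd capability, or refuse."""
    if path not in ALLOWED:
        raise PermissionError(f"no capability for {path}")
    return os.open(path, os.O_RDONLY)

def sandboxed_read(fd: int) -> bytes:
    """Sandboxed side: can read from this fd and do nothing else."""
    return os.read(fd, 4096)

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"granted file contents")
    granted = f.name

ALLOWED.add(granted)
fd = broker_open(granted)
assert sandboxed_read(fd) == b"granted file contents"
os.close(fd)

try:
    broker_open("/etc/shadow")   # never granted -> refused
    assert False
except PermissionError:
    pass
```

The key property is that ambient authority ("you may open files") is replaced by specific, unforgeable handles.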


This seems neat in theory but it is very difficult to actually do in practice. For example, let's say that you are allowed to make connections to a specific server. First, you have to get everyone onboard with isolating their code at that granularity, which requires a major rewrite that is easy to short-circuit by just being lazy and allowing overly broad permissions. But even then "a server" is hard to judge. Do you look at eTLD+1 (and ship PSL)? Do you look at IP (and find out that everything is actually Cloudflare)? Do you distinguish between an app that talks to the Gmail API, and one that is trying to reach Firebase for analytics? It's a hard problem. Most OSes do have some sort of capabilities as you've mentioned but the difficulty is not making them super fine-grained but actually designing them in a way that they have meaning.


Yes, exactly. The implementation difficulties are why this idea hasn't taken the world by storm yet. Incentives are also not there for app developers to think about programming this way - it's much easier to just request a general permission and then work out the details later.

For the server ID, it really should be based on the public key of the server. A particular service should keep its private key secret, broadcast a public key for talking to it, and then being able to encrypt traffic that the server can decrypt defines a valid connection capability. Then the physical location of the server can change as needed, and you can intersperse layers of DDoS protection and load balancing and caching while being secure in knowing that all the intervening routers do not have access to the actual communication.
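The "identity is the public key" idea above amounts to key pinning: the capability names a key fingerprint, and any endpoint that proves control of that key counts as "the server", wherever it physically lives. A toy sketch, with fake key material standing in for real TLS-style pinning:

```python
# Server identity as a pinned public-key fingerprint, not a hostname or IP.
import hashlib

def key_fingerprint(pubkey_bytes: bytes) -> str:
    return hashlib.sha256(pubkey_bytes).hexdigest()

# The capability the user granted: "you may talk to the holder of this key"
PINNED = key_fingerprint(b"---fake service public key---")

def connection_allowed(presented_pubkey: bytes) -> bool:
    return key_fingerprint(presented_pubkey) == PINNED

assert connection_allowed(b"---fake service public key---")
assert not connection_allowed(b"---some other key---")
```

This sidesteps the eTLD+1/IP/Cloudflare ambiguity from the parent comment, at the cost of having to manage key distribution and rotation.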


Should Google and YouTube share a key? How about Google LLC and Google UK Limited?


No, and both Google and YouTube should be multiple keys. YouTube, for example, should have separate capabilities for watching videos (ideally on a video-by-video basis); autoplaying; viewing metadata; posting comments; uploading videos; viewing playlists; editing playlists; signing up for YouTube Premium; and a number of other operations. That way, somebody who has granted their software agent the capability to play YouTube instructional videos doesn't risk having their credit card charged for YouTube Premium or ruining their reputation by posting questionable spam on everybody else's videos.

Setting up the economic incentives to encourage this is a hard problem. Right now, the disincentives to this are: 1) it adds engineering overhead to secure everything behind a different private key 2) it adds PM overhead to determine what the right granularity is and what potential adversarial avenues exist 3) it creates user confusion as they're bombarded with a list of fine-grained permissions 4) it prevents tech companies from cross-selling new features and impedes new feature discovery. There's basically no reason to do it other than protecting the user, and the user probably won't know they've been compromised until long after they start using the service.


One can sort of get there today combining something like attribute based access control, signed bearer tokens with attributes, and some sort of a degrees-of-delegability limiter that a bearer can pass along like a packet TTL.

Did you want it in Rust?

- https://github.com/eclipse-biscuit/biscuit-rust

- https://www.biscuitsec.org/
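The degrees-of-delegability idea can be sketched as a token that carries a remaining-hop count, decremented on each hand-off, where delegation may only narrow rights. This is a toy illustration of the concept, not Biscuit's actual attenuation model:

```python
# Bearer token with a delegation TTL: each hand-off uses a hop and can
# only narrow (never widen) the attribute set.
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    attrs: frozenset   # e.g. frozenset({"read"})
    hops_left: int     # like a packet TTL, but for re-delegation

    def delegate(self, attrs: frozenset) -> "Token":
        if self.hops_left <= 0:
            raise PermissionError("delegation depth exhausted")
        if not attrs <= self.attrs:
            raise PermissionError("cannot widen rights on delegation")
        return Token(attrs, self.hops_left - 1)

root = Token(frozenset({"read", "write"}), hops_left=2)
child = root.delegate(frozenset({"read"}))        # ok: narrower, hop 1 used
grandchild = child.delegate(frozenset({"read"}))  # ok: hop 2 used

try:
    grandchild.delegate(frozenset({"read"}))      # third hop -> refused
    assert False
except PermissionError:
    pass
```

In a real system the attenuation would be cryptographically bound into the signed token (as Biscuit does) rather than enforced by an honest dataclass.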


The people that are good but dangerous drivers will drive well and safely during tests, so you won't catch them.


We need a consistently reliable public transit system before we tell people they can't drive for one reason or another.


Allow drink driving in places with no metro system? There are obviously lines to be drawn in what you allow, for the safety of others, regardless of the alternatives. That said, we can absolutely work on improving public transport at the same time. There's no reason to have to fully solve public transport before trying to tackle dangerous driving.


People don't need to drink. They do need to get to work, pick up the kids from school, buy groceries, etc.


> We need a consistently reliable public transit system

Like Waymo


You can market products that people need. A big part of this is explaining and educating someone about what your product does, another part is just getting the word out there. Every website homepage is more-or-less a marketing page.

If no one is marketing a product, then nobody knows about it.

