This is usually caused by an insufficiently seeded PRNG.
Are you generating the UUID in the backend, or the frontend? The frontend is fundamentally unreliable for many reasons, including deliberate collisions, so in that case you'll need to handle collisions somehow. You can still engineer around common sources of collisions, but the specifics depend on the environment.
On the other hand, making the backend reliable is feasible. What kind of environment is your code running in? Historically VMs sometimes suffered from this problem, though this should be solved nowadays. Heavily sandboxed processes might still run into it, if the RNG library uses an unsafe fallback. Forking processes or VMs can duplicate RNG state and thus cause collisions.
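If the backend runs on Linux, one low-effort way to sidestep a badly seeded user-space PRNG is to take the randomness straight from the kernel's CSPRNG; that also avoids the process-fork duplication problem, since the state lives in the kernel rather than in your process. A minimal sketch:

```
uuidgen --random                    # util-linux, backed by a CSPRNG
cat /proc/sys/kernel/random/uuid    # v4 UUID generated by the kernel itself
```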
I remember hearing that Segment (the analytics company) had their entire product built around UUIDs generated in web browsers. There were collisions all over the place; the product seemed fundamentally incapable of producing useful data because of it. Hopefully they've fixed that by now.
For proprietary software, sure. But open source projects rarely ever work like this.
Especially for a project like the kernel, there's no reasonable way to decide who out of thousands of interested parties should have access first.
Android is a rare exception; a few years ago they started a program where phone manufacturers get very favorable early access to AOSP code, 4 months ahead of public release.
Likely what they are implying is that chargebacks have indirect costs that you can ballpark at around $50 per chargeback. So Steam would likely take back the $5 revenue from the developer for the $5 chargeback, but the costs of processing the chargeback are absorbed by Steam. I don't know whether they charge developers a separate chargeback fee, but it wouldn't make sense to, as Steam is the one validating and processing payments.
I'd try a modern file system with de-duplication/copy-on-write support. With coreutils 9.0 or newer, `cp` creates reflinks automatically (`--reflink=auto` is the default) if the filesystem supports copy-on-write.
> Support for reflinks is indicated using the remap_file_range operation, which is currently (6.18) supported by bcachefs, Btrfs, CIFS, NFS 4.2, OCFS2, overlayfs, and XFS. Some external file systems support them too, including bcachefs and OpenZFS.
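On such a filesystem you can control the behavior explicitly via the `cp` flags (file names are just examples):

```
cp --reflink=always big.img clone.img   # reflink, or fail outright if unsupported
cp --reflink=auto   big.img copy.img    # reflink if possible, else a normal copy
```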
That's the part of your post I have trouble understanding. That you need to work around colliding ports suggests that the containers spun up by the agent run directly on the host, not inside some form of nested containerization. But if you do that, how do you ensure that the application running in those containers is sandboxed just as strictly as the agent itself?
The docker compose stack for the applications is spun up on the host. The agents have access to the docker socket, which means they can talk to docker from inside their sandbox and spin up new sibling containers on the host. Yolobox isn't designed for full isolation, just for catching accidental commands you wouldn't want to run on the host, and for giving agents a customizable environment they control.
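For anyone unfamiliar with the pattern, this is the usual socket bind-mount, sketched with a placeholder image name:

```
# Hand the sandbox the host's docker daemon; containers it starts become
# siblings on the host, which is why published ports can collide there.
docker run -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  agent-sandbox
```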
Early on in development I tried to harden the container to prevent deliberate escapes by the agent. This was a waste of time as the agents just kept finding more and more exploits when I asked them to try and break out.
I wouldn't assume that a VM will give you complete security against a determined AI. yolobox started as a way to prevent accidental `rm -rf ~` and has expanded into a set of tools that make working with CLI agents easier.
Personally, I run yolobox directly on the host. Being able to tell the agent it has sudo and can install and do whatever it needs to accomplish any task is handy.
Docker was only exposed later, after I realized that any sufficiently determined AI could break out of the container, and attempts to contain it were a waste of time. Also note that the docker socket is not exposed by default. There's a --docker flag for this.
I made some comments about exploits in the original post [1]. Gemini was quite creative, adding git hooks to the repo that would execute on the host machine, since that folder is shared with the host.
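Roughly, the trick looks like this (the payload here is hypothetical): because the repo directory is shared, a hook written inside the sandbox runs on the host the next time someone commits there.

```
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# executes with the *host* user's privileges on their next commit
touch /tmp/escaped-the-sandbox
EOF
chmod +x .git/hooks/pre-commit
```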
How well does Fedora handle proprietary software nowadays? For example the Nvidia driver, Steam, Rider, or video codecs. I have bad memories of their patent paranoia around elliptic curve cryptography.
My favourite feature of Manjaro (and presumably Arch) is how easily I can install almost any software from a single package manager (which supports the official repos, flatpak and the AUR). On Mint I had to mess with custom package sources, or install individual vendor-provided packages that lacked auto-update.
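For the curious, that single package manager is pamac; a quick sketch (package names are illustrative, and AUR support has to be switched on in pamac's preferences first):

```
pamac install vlc                     # official repos
pamac build visual-studio-code-bin    # built from the AUR
```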
There's still a bit of manual work involved to install the codecs (and proprietary drivers if you need em), which is why I would never recommend vanilla Fedora to a newbie - but Fedora derivatives exist to address that issue.
Ultramarine[1] is one such easy-to-use derivative, and for gamers there's Nobara[2] and Bazzite[3] (an immutable distro).
I've never really understood what Bazzite offers that stock Fedora does not. Steam works out of the box just fine on plain ole Fedora 41, and my AMD card is supported without issue. It runs CP2077 flawlessly.
Literally, Steam "out of the box" is just adding the RPM Fusion repos, which you're probably gonna do anyway if you want stuff like VLC or other tools.
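Concretely, that's the standard RPM Fusion setup, a couple of dnf commands:

```
sudo dnf install \
  https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm \
  https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
sudo dnf install steam
```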
It's a lot more than just Steam, it's a custom kernel, custom CPU scheduler, additional drivers for game controllers, drivers for handheld devices and a bunch of other tweaks and tools (Bazaar, Lutris, MangoHUD, ujust scripts etc). But more than all that, the biggest draw for Bazzite is that it's an immutable distro with atomic updates, so updates "just work" and it's very very hard to break the system.
And in the rare event you get a bad update, you can just boot to the previous two images right from the boot menu, no need for any commands or restores - just boot the image and keep using it without any worries. You can pin known good images too, so you know for sure you always have a working image you can boot into. And you have access to the previous 90 days of images (via Github), so you can switch to any old image (or the latest beta) for bug/regression testing, without needing to do lengthy backups and restores.
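Under the hood it's the standard rpm-ostree/ostree tooling Bazzite inherits from Fedora Atomic, so the rollback/pin workflow looks roughly like:

```
rpm-ostree rollback       # make the previous deployment the boot default
sudo ostree admin pin 0   # pin the current deployment as a known-good image
rpm-ostree status         # list the deployments available at boot
```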
All this makes it ideal for someone who just wants their system to work without worrying about updates and stuff, getting you a console-like experience on PC.
> It turns out too that Railway stores backups in the same volume.
That's probably not quite correct. I'd guess the snapshots are synchronized elsewhere (e.g. object storage). But the snapshots are logically owned by the volume resource, and deleting the volume deletes the associated snapshots as well. I think AWS EBS volumes behave like that as well.
The article seems to assume that this company added an endpoint for deleting the database. My reading of the original article was that the cloud provider offers an API to manage their resources, which includes an API to delete a volume.
The article proposes automation as the solution for such mistakes. But infrastructure automation tools like Terraform rely on the exact API that resulted in the database getting deleted.
IMO the biggest mistakes were:
1. Having an unrestricted API token accessible by AI. Apparently they were not aware that the token had that many permissions.
2. No deletion protection on the production database volume.
3. Deleting a volume immediately deletes all associated snapshots. Snapshot deletion should be delayed by default. I think AWS has the same unsafe default, but at least their support can restore the volume. https://alexeyondata.substack.com/p/how-i-dropped-our-produc...
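For comparison, this is roughly how you'd opt into both protections on AWS (identifiers are placeholders):

```
# refuse API deletion of the database instance
aws rds modify-db-instance --db-instance-identifier prod-db --deletion-protection
# keep deleted EBS snapshots recoverable for 7 days via Recycle Bin
aws rbin create-rule --resource-type EBS_SNAPSHOT \
    --retention-period RetentionPeriodValue=7,RetentionPeriodUnit=DAYS
```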
AI wasn't the main issue (though it grabbing tokens from random locations is rather scary). But automation isn't the answer either; a Terraform misconfiguration could just as easily have deleted the database.
Their cloud provider needs to work on safe defaults (limited privileges and delayed snapshot deletion) and on communicating more clearly (the user should notice when they're creating an unrestricted token).