> What’s become more fun is building the infrastructure that makes the agents effective.
Solving new problems is a thing engineers get to do constantly, whereas building an agent infrastructure is mostly a one-ish time thing. Yes, it evolves, but I worry that once the fun of building an agentic engineering system is done, we’re stuck doing arguably the most tedious job in the SDLC, reviewing code. It’s like if you were a principal researcher who stopped doing research and instead only peer reviewed other people’s papers.
The silver lining would be if the feeling of faster progress from these AI tools gave enough satisfaction to replace the missing satisfaction of problem-solving. Different people will derive different levels of contentment from this. For me, it has not been an obvious upgrade in satisfaction. I’m definitely spending less time in flow.
If you save 3 hours building something with agentic engineering and that PR sits in review for the same 30 hours or whatever it would have spent sitting in review if you handwrote it, you’re still saving 3 hours building that thing.
So in that extra time, you can now stack more PRs that still have a 30 hour review time and have more overall throughput (good lord, we better get used to doing more code review)
This doesn’t work if you spend 3 minutes prompting and 27 minutes cleaning up code that would have taken 30 minutes to write anyway, as the article details, but that’s a different failure case imo
> So in that extra time, you can now stack more PRs that still have a 30 hour review time and have more overall throughput
Hang on, you think that a queue that drains at a rate of $X/hour can be filled at a rate of 10x$X/hour?
No, it cannot: it doesn't matter how fast you fill a queue if the queue has a constant drain rate; sooner or later you either hit the bounds of the queue, or the items coming off it are too stale to matter.
In this case, filling a queue at 20 items per hour (one every 3 minutes) while it drains at 1 item every 5 hours means that after a single 8-hour day you've queued 160 PRs and reviewed only one or two of them.
IOW, after one day the last PR in the queue is sitting behind roughly 158 others, i.e. about 158 × 5 ≈ 790 hours until it's reviewed. After a second day it's over 1,500 hours.
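The arithmetic can be sanity-checked with a few lines of Python. The fill and drain rates are the hypotheticals from this thread, and one simplifying assumption is mine: reviews happen only during the same 8-hour workday as the filing.

```python
# Toy model of the review-queue dynamics above: PRs are filed at 20/hour
# during an 8-hour workday, reviews complete at 1 per 5 hours (0.2/hour)
# during the same workday.

def review_backlog(days, fill_per_hour=20, work_hours=8, drain_per_hour=0.2):
    """Return (queued PRs, wait in hours for the last PR) after `days` workdays."""
    filed = fill_per_hour * work_hours * days
    reviewed = drain_per_hour * work_hours * days
    queued = max(filed - reviewed, 0)
    # The last PR waits for everything ahead of it to drain.
    wait_hours = queued / drain_per_hour
    return queued, wait_hours

for d in (1, 2):
    q, w = review_backlog(d)
    print(f"day {d}: backlog ~{q:.0f} PRs, last PR waits ~{w:.0f} hours")
```

The backlog grows linearly as long as the fill rate exceeds the drain rate; no amount of faster generation fixes that without also speeding up review.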
This is the fundamental issue currently in my situation with AI code generation.
There are some strategies that help. A lot of the AI directives need to go towards making the code easy to review: clarity and granularity (code should be committed in reviewable chunks - units of work that make sense for review) rather than whatever you would have done previously, when code production was the bottleneck. Similarly, AI use needs to be weighted not just towards more tests, but towards tests that concretely and clearly answer the questions that come up in review (what happens on this boundary condition? what if that variable is null?). Finally, changes need to be stratified along lines of risk rather than code modularity or other dimensions. That is, if a change is evidently risk-free (in the sense of "even if this IS broken, it doesn't matter"), it should be possible to approve and merge it rapidly. Only things where it actually matters if it's wrong should be blocked.
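A sketch of what "tests that answer review questions" can look like: each assertion states the question a reviewer would otherwise have to ask. `normalize_discount` is a hypothetical function invented for the example, not anything from this thread.

```python
def normalize_discount(percent):
    """Clamp a discount percentage to [0, 100]; None means no discount."""
    if percent is None:
        return 0
    return min(max(percent, 0), 100)

# Reviewer question: what happens if the variable is null?
assert normalize_discount(None) == 0
# Reviewer question: what happens on the boundary conditions?
assert normalize_discount(0) == 0
assert normalize_discount(100) == 100
assert normalize_discount(150) == 100   # over-limit input is clamped down
assert normalize_discount(-10) == 0     # negative input is clamped up
print("review questions answered")
```

Tests written this way let a reviewer approve the behavior without re-deriving it from the implementation.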
I have a feeling there are whole areas of software engineering where best practices are just operating on inertia and need to be reformulated now that the underlying cost dynamics have fundamentally shifted.
>Finally, changes need to be stratified along lines of risk rather than code modularity or other dimensions.
Why don't those other dimensions, and especially the code modularity, already reflect the lines of business risk?
Lemme guess, you cargo culted some "best practices" to offload risk awareness, so now your code is organized in "too big to fail" style and matches your vendor's risk profile instead of yours.
> Why don't those other dimensions, and especially the code modularity, already reflect the lines of business risk?
I guess the answer (if you're really asking seriously) is that previously when code production cost so far outweighed everything else, it made sense to structure everything to optimise efficiency in that dimension.
So if a change was implemented, the developer would deliver it as a functional unit that might cut across several lines of risk (low-risk changes like updating some CSS sitting alongside higher-risk ones like a database migration, all bundled together), because that was what made it fastest for the developer to implement.
Now if AI is doing it, screw how easy or fast it is to make the change. Deliver it in review chunks.
Was the original method cargo culted? I think most of what we do is cargo culted regardless. Virtually the entire software industry is built that way. So probably.
If your team's bottleneck is code review by senior engineers, adding more low quality PRs to the review backlog will not improve your productivity. It'll just overwhelm and annoy everyone who's gotta read that stuff.
Generally if your job is acting as an expensive frontend for senior engineers to interact with claude code, well, speaking as a senior engineer I'd rather just use claude code directly.
And when the PR you never even read (because the AI wrote it) gets bounced back to you with an obscure question 13 days later... you're not going to be well positioned to respond to that.
Goody | Remote | $150–250K + equity and benefits | North/South America | Full-time
I'm Mark, the technical co-founder and CTO at Goody. We're building a gifting product that every business can use to recognize employees, retain customers, and accelerate sales. Despite being something everyone does, gifting is one of the areas of commerce yet to be disrupted, and we're working on building the best and most delightful product in this space.
Our product is used by Google, Stripe, Anthropic, Meta, NBCUniversal, Notion, and others, and we also offer a developer API for commerce. Tech stack is Ruby + React + TypeScript, though we're flexible on backend language if you know Python or Node.js better. All roles are full-stack.
We're coming off of a big year and planning for scale in 2026. We have a few new roles to accelerate our growth.
• Staff Software Engineer ($200–250K) — for those who ship at a startup pace and have a great eye for detail
• Senior Software Engineer, Customer Engineering ($150–200K) — if you like to hear a customer request in the morning and tell them it’s shipped in the afternoon. US and Canada only for this one
• Senior Software Engineer, Growth ($150–200K) — be the engineer who has the most direct impact on our growth
We're looking for people who have great startup energy, want to win, and bring great vibes to our tight-knit team.
Your position detail pages apparently require WebGL to prevent them from "crashing." That and "complex SPA" in the text (found from reader mode) tell me you might very well be overengineered.
I'd say it's quite the opposite: a deep understanding of what you like, and consequently an understanding of what will make a creation into exactly what you like. (Though I guess some people can create without understanding, just directly expressing their likes.)
Since many of our likes are driven by our shared culture and physiology, many other people will appreciate such creation (even if they don't understand why exactly they like it). Others will appreciate depth of nuance and uniqueness of your creation.
The opposite of taste is an approximated "good" average: likeable, but it never hits all the right notes, and it's already suffering from sameness fatigue.
It's subjective in the philosophical sense (the subject of the predication is involved with the judgment itself) but that doesn't mean it can't be "right" (and probably more importantly, "wrong").
Everyone who has built software knows that the hardest parts involve making complex, tricky decisions with tradeoffs. Let’s say you make a grocery list app. Now you have to make decisions about all the different ways to specify quantity. Units, weight, dollars, bunches… oh, and fractional vs. decimal weight, etc…
The claim is that now every random person will build their own app and have to make those hard decisions instead of paying $5 a month for someone else to do that work. Comparative advantage doesn’t just apply to the cost of writing code, but also the effort of making product decisions.
Edit: I don’t mean that a grocery app should cost $5/month, the grocery app was a toy example and the $5/month refers to an example of a separate app you’d pay for with much more value.
This thread hits very close to home for me. I'm engineering the frontend for a grocery list app as a capstone project right now and I'm handling a lot of the product and feature decisions, and the discussion about "just prompt Claude to build it" versus the reality of those decisions is something my team deals with constantly.
The example of reverse-engineering your grocery store's API and building a custom solution is awesome, and it's exactly the kind of thing that's now possible. But what I've found is that even with AI assistance, there are so many interconnected decisions that make this more than a one-shot prompt project.
I pushed for us to build a mobile app specifically to take advantage of portability (use it at home for planning, at the store for shopping) and the camera (image recognition with OpenAI and scanning barcodes with expo-camera). That sounds simple, but it cascades into hundreds of UX decisions about offline-first architecture, gesture patterns, camera permissions, and more.
The units and quantities problem mentioned in this thread is just the tip of the iceberg. I'm trying to figure out a data model that mirrors how people naturally think about groceries: how they categorize items, how they plan meals versus staples versus impulse buys, how they track what's running low. Modeling those mental models is genuinely hard.
What helps is that I worked as an ecommerce shopper at Whole Foods, and I learned that stores are meticulously organized with numbered bays and predetermined routes optimized for efficiency. Translating that knowledge into a system that can intelligently sort a shopping list based on store layout (which varies by location!) and typical shopping patterns is genuinely complex.
One of my teammates put it well: this is a simple idea, but it requires a level of care, expertise, and experience to get it right. AI's incredibly helpful for implementing solutions once we've made these decisions, but the decisions themselves require domain knowledge, user research, and taste. That's the part that's hard to automate, and it's what makes this a real engineering project rather than a weekend Claude experiment.
Some things are just not suited to an app. It's still easier to jot down a shopping list on a piece of paper than to use an app and a janky mobile phone keyboard. And bonus, nobody gets to sell your shopping preferences or blast you with ads as you're trying to use it!
I spent years trying to find the PERFECT pantry-tracking, auto shopping-list-generating, auto "what can I make tonight with what I have", auto meal-prepping app. The idea seemed so simple in my mind back then. Let me input everything I have, then as I pull ingredients out of the fridge I just "decrease eggs by 3, decrease butter by 1tbsp, decrease bacon by 2 slices", and over time it will just build my shopping list for me, etc. I even built a requirement list and spent a year implementing my own thing.
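The core decrement-and-regenerate mechanism described here fits in a few lines; item names and restock thresholds below are made up for the sketch.

```python
# Minimal pantry tracker: decrement what you use, and regenerate the
# shopping list from whatever has dropped below its restock threshold.

pantry = {"eggs": 12, "butter_tbsp": 16, "bacon_slices": 10}
restock_at = {"eggs": 6, "butter_tbsp": 4, "bacon_slices": 4}

def use(item, amount):
    pantry[item] = max(pantry[item] - amount, 0)

def shopping_list():
    return [item for item, qty in pantry.items() if qty <= restock_at[item]]

use("eggs", 3); use("butter_tbsp", 1); use("bacon_slices", 2)
use("eggs", 4)              # another breakfast later in the week
print(shopping_list())      # eggs have dropped to 5, below the threshold of 6
```

As the rest of the story makes clear, the hard part was never this logic - it was the data entry and the device on the fridge.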
Given the number of apps out there, from dozens of OSS hobbyist apps to industrial restaurant inventory management ones, I wasn't alone in thinking this was a solved problem and someone must have the perfect interface for it. Between auto-unit-converting apps, natural language processing apps, @cooklang, a million ideas about tracking pantries and ingredients and their categories, frequency-of-use charts, etc...
Then one time I went on a trip with a friend to his hometown, where we stayed at his parents' house. His 78-year-old mother had two notepads attached to the fridge with a pencil on a string. As she worked in the kitchen, between washing her hands she would jot down random notes, cross out others, and doodle on one notepad; on the other she added meal plans as she went along. Then when we were going to the market she just ripped the page off.
Sounds so fucking simple and easy, and I felt so stupid for the amount of effort I'd put into figuring out the right app; the right device to mount on my fridge and how to connect power to it; how to make it not always-on so it wouldn't blind me at night, but without having to keep fiddling with it to unlock it; how to use it with wet fingers; how to translate units and "catch up" when I missed updating it for a couple of meals; how to hide ingredients I don't care about and highlight the ones I do; how to rearrange the interface. It seriously gave me pause at how dumb I'd been: the solution was much, much simpler, and I'd pigeonholed my thinking into a tech solution for some reason.
Can't sell people notepads though. There is no margin or lock-in in that stuff.
I do hear what you're saying, and I've wrestled with "not everything should / can be an app". That being said, I'm still trying to solve food (for myself) with computers, haha.
Right now, that looks like trying to create a nutritionally-optimal "dog food for humans", using combinatorial optimization solvers. I think I'm going to write something up as a post when it becomes a bit more feature-complete.
It's living at chow.seanjohnsen.com if you're curious! Would love feedback from someone who has thought along these lines.
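A toy version of the combinatorial-optimization idea mentioned above: exhaustively search serving counts for the cheapest combination of foods that meets daily nutrient minimums. The foods, prices, and nutrient numbers are invented for the sketch (a real version, like the commenter's, would use a proper solver and real nutrition data).

```python
from itertools import product

foods = {
    #            cost $, protein g, fiber g  (per serving; made-up numbers)
    "oats":      (0.30, 5, 4),
    "lentils":   (0.50, 9, 8),
    "eggs":      (0.40, 6, 0),
}
need = {"protein_g": 50, "fiber_g": 30}

best = None
for servings in product(range(9), repeat=len(foods)):   # 0..8 servings of each
    cost = protein = fiber = 0
    for n, (c, p, f) in zip(servings, foods.values()):
        cost += n * c
        protein += n * p
        fiber += n * f
    if protein >= need["protein_g"] and fiber >= need["fiber_g"]:
        if best is None or cost < best[0]:
            best = (cost, dict(zip(foods, servings)))

print(best)   # cheapest feasible plan: (cost, servings per food)
```

Brute force is fine for three foods; with a realistic pantry the search space explodes, which is why linear/integer programming solvers are the usual tool for this "diet problem".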
A grocery list app is the perfect example of the kind of thing that AI will make obsolete. Why would I pay $5/month for a list app when I can pay Claude $0.30 one time to make it for me?
I in fact did just that. I used Claude to reverse engineer my grocery store's API and build a grocery list app that automatically pulls in the aisle information for each item and sorts it by how I typically walk through the store. It's the kind of thing that would be incredibly difficult to scale but works just fine when you only have one user. No SaaS grocery app can hope to compete with me being able to tailor my own shopping list app to my exact preferences.
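The "sort my list by how I walk the store" idea reduces to a sort over a personal aisle ordering. In the commenter's version the aisle data comes from the store's reverse-engineered API, which isn't shown here; the data below is hard-coded and hypothetical.

```python
# Which aisle each item lives in (in the real version, fetched from the
# store's API per location).
aisle_of = {"bananas": "produce", "milk": "dairy", "flour": "aisle 4",
            "salsa": "aisle 7", "ice cream": "frozen"}

# The order I actually walk the store, front door to checkout.
walk_order = ["produce", "aisle 4", "aisle 7", "frozen", "dairy"]

def sort_list(items):
    return sorted(items, key=lambda i: walk_order.index(aisle_of[i]))

print(sort_list(["milk", "salsa", "bananas", "ice cream", "flour"]))
# ['bananas', 'flour', 'salsa', 'ice cream', 'milk']
```

This is exactly the kind of n=1 personalization a SaaS app can't ship: `walk_order` is one person's route through one store.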
Your grocery store has a free API you can use? Even if that's the case, it will soon change: if app building becomes "free", the cost will shift over to the data access.
That is exactly the type of awesome app that can now be built. I edited my comment to clarify that the grocery app and $5/month app are separate examples, but I think your example shows that someone with coding knowledge can build something extremely useful for n=1 users which I fully support.
I just don’t think most people will end up doing that just like how most people don’t 3D print their own desk drawer organizers even when Gridfinity does all the work for you. Automation doesn’t fully replace the volition to build a thing and make tricky decisions that are familiar to us software engineers but not others.
Exactly. I think only software developers believe AI is going to kill app subscriptions, because they're the ones who can actually wrangle the output into something maintainable.
For anyone without dev or product experience, getting beyond a basic feature set and keeping it running reliably (or rolling it out to more than one user) is still a massive challenge.
Yes, that’s a possibility! And for app types that have a limited ceiling of how much value they can provide, that will definitely be a thing as an AI app can saturate all of that value.
But for apps that have a lot of ceiling, people will still gravitate to apps that have had more care and attention than someone vibe coding it once and throwing it on the store, just like how people choose those well-built and maintained apps today over using their built-in Reminders app.
There’s a difference between abstracting away the network layer and not understanding the business logic. What we are talking about with AI slop is not understanding the business logic. That gets really close to just throwing stuff at the wall and seeing what works instead of a systematic, reliable way to develop things that have predictable results.
It’s like if you are building a production line. You need to use a certain type of steel because it has certain heat properties. You don’t need to know exactly how they make that type of steel. But you need to know to use that steel. AI slop is basically just using whatever steel.
At every layer of abstraction in complexity, the experts at that layer need to have a deep understanding of their layer of complexity. The whole point is that you can rely on certain contracts made by lower layers to build yours.
So no, just slopping your way through the application layer isn’t just on theme with “we have never known how the whole system works”. It’s ignoring that you still have a responsibility to understand the current layer where you’re at, which is the business logic layer. If you don’t understand that, you can’t build reliable software because you aren’t using the system we have in place to predictably and deterministically specify outputs. Which is code.
This is not hype-chasing. AI is a key part of software engineering now. For this to be absent from Xcode would be an existential risk for the future of the product.
It most certainly is not, lol. That's the hype that the parent was referring to. Most people have found AI to be a detriment, not a benefit, to their work.
Millions in marketing efforts? Anyways, it may be a key part in generating code, but that was always a lesser part of software engineering. If it's generating code it doesn't mean it is doing any engineering for you or becoming a "key part" of it in any way.
No, it isn’t. There are irresponsible voices in the community who claim that it is, but they always find convenient ways to omit the downsides (on both the tech and effects on society as a whole).
Claude Code from the terminal is serviceable enough. Yet I cannot open the same project in different versions of Xcode without some manual finagling. Xcode is at no existential risk, for it is the only tool you are allowed to use to reach your audience on the App Store. Don't be ridiculous. The reason Xcode is as broken as it is today is that same exact reason: the developer experience need not be great. As long as you can coax the trash fire of a toolchain into uploading a signed app to App Store Connect, there is zero incentive for Apple to put any time into the tool.
> On our project, it's still useless because it can't use the semantic search in the IDE.
Zed's ACP seems to be a good solution to this - when using it, claude code has access to the IDE's diagnostics and tools, just like the human operator. https://zed.dev/acp
Sure, it's a dumpster fire. But human engineers work on it just fine without investing man-decades into refactoring it into some shrine to the software engineer's craft.
The whole point of AI, in our parent company's eyes, is for no one to mention "code quality" as something impeding the delivery of features, yesterday, ever.
Claude, with a modicum of guidance from an engineer familiar with your monolith, could write comprehensive unit tests of your existing system, then refactor it into coherent composable parts, in a day.
Not doing so while senior management demands the use of AI augmentation seems odd.
TUI is easy to train on, but hard to use for users. Part of the reason it’s easier to have LLMs use a bunch of Unix tools for us is that their text interface is tedious and hard to remember. If you’re a top 5% expert in those tools it doesn’t matter as much I guess but most people aren’t.
Even a full-featured TUI like Claude Code is highly limited compared to a visual UI. Conversation branching, selectively applying edits, flipping between files, all are things visual UI does fine that are extremely tedious in TUI.
Overall it comes down to the fact that people have to use TUI and that’s more important than it being easy to train, and there’s a reason we use websites and not terminals for rich applications these days.
I use headless mode (-p) and keep my named shell histories as journals (so no curses/TUI or GUI). But session management and branching could improve at the tool level, allowing seamless integration with completion tools - which could be a simple, fast AI looking at recent sessions, or even something visual, say for navigating and extending a particularly complex branch of a past session. Shell-based flows like this aren't too hard to work with inside Emacs for me, but it would be nice if there were some standardization effort on the tool front and some additional care for future automation. I don't want my AI clicking buttons if it can be precise instead. And I certainly want multithreading. I think of AI more as an OS; it needs a shell more than it needs windows at this point in time.
All the examples given for visual UI are tasks that are already (or will soon be) done by the agent, not a human - hence not needed.
I suspect that the final(*) UI is much more like a TUI: conversational (human <> AI). The current GUIs provided by your bank etc. are much less effective and useful than the conversational "show me / do the thing I need". Not to mention the walled-garden effect, and attention-grabbing that isn't in the user's interest (popups, self-promo, nagging). There's also the age factor, and not having to learn yet another GUI (try teaching a new bank app to your mom ;).
So that's at least four distinct and important advantages for the TUI.
My bet: TUI/conversation wins (*).
*) There will be some UIs where graphical information density matters (air traffic control?), especially in time-critical environments. Yet even there I suspect it's more like a conversation with dynamic images/reports/graphs generated on the fly, not the UI per se.
Is Postgres fast enough for job processing these days? We do hundreds of millions of jobs now and even years ago when our volume was a fraction of that, we got a huge performance boost moving from Postgres + Que to Redis + Sidekiq. Has that changed in the intervening years?
Hundreds of millions over what time frame? I got a system with Rails/Solid Queue + Postgres and doing about 20M jobs/day on a $45/mo VM with plenty of room to spare.