Hacker News | afro88's comments

> Writing detailed specs and then giving them to an AI is not the optimal way to work with AI.

> That's vibecoding with an extra documentation step.

Read uncharitably, yeah. But you're making a big assumption that the writing of the spec wasn't driven by the developer, checked by the developer, adjusted by the developer, and rewritten when incorrect.

> You can still make the decisions, call the shots

One way to do this is to do the thinking yourself, tell it specifically what you want it to do and... get it to write a spec. You get to read what it thinks it needs to do, and then adjust or rewrite parts manually before handing off to an agent to implement. It depends on task size, of course: if the task is small or simple enough, no spec is necessary.

It's a common pattern to hand off to a good instruction-following model, and a fast one if possible. Gemini 3 Flash, for example, is very good at following a decent spec. But Sonnet is also fine.

> Stop trying to use it as all-or-nothing

Agreed. Some things just aren't worth chasing at the moment. For example, in native mobile app development it's still almost impossible to get accurate, idiomatic UI that uses native components properly and adheres to the HIG, etc.


This is my workflow: converse with it to write a spec, reviewing the spec myself. Ask it to trace out how it would implement it. I know the codebase because it was originally written mostly by hand. Correct it with my best practices. Have it challenge my assumptions and read the code to do so. Then it's usually good enough to go on its own. The beauty of having a well-defined spec is that once the work is done, I can have another agent review it, and it generates good feedback if the implementation deviates from the spec at all.

I'm unsure if this is actually faster than writing it myself, but it certainly expends less mental energy for me personally.

The real gains I'm getting are with debugging prod systems. Where I would normally have to touch five different interfaces to track down an issue, I've encompassed it all within an MCP server and direct my agent through the debugging steps (check these logs, check this in the DB, etc.).
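The idea of bundling several debugging interfaces behind one surface that an agent can drive could be sketched like this. Everything here is hypothetical (class and method names, simulated backends); a real version would query actual log and database systems, and an MCP server could expose each method as a tool.

```python
from dataclasses import dataclass


@dataclass
class DebugReport:
    source: str        # which interface produced this (e.g. "logs", "db")
    findings: list     # matching entries, already filtered for the agent


class DebugFacade:
    """Hypothetical facade over the separate interfaces (logs, DB, ...)
    an engineer would otherwise check by hand. An MCP server could expose
    each method as a tool the agent calls during a debugging session."""

    def check_logs(self, service: str, needle: str) -> DebugReport:
        # Placeholder: a real version would query the log backend.
        lines = [f"{service}: simulated log line containing {needle}"]
        return DebugReport("logs", [ln for ln in lines if needle in ln])

    def check_db(self, table: str, key: str) -> DebugReport:
        # Placeholder: a real version would run a read-only query.
        return DebugReport("db", [f"{table} row for {key}: <simulated>"])


facade = DebugFacade()
print(facade.check_logs("checkout", "timeout").findings)
```

The point of the facade is less the code than the contract: the agent gets a small, named set of safe, read-only operations instead of raw access to five production systems.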


Experienced engineers who know the codebase and system well, and who have enough time to consider the problem properly, would likely consider this case.

But if we're vibing... this is the kind of bug that should make it back into a review agent/skill's instructions in a more generic form: if something is done to the message history, check that there are tests verifying subsequent turns work as expected.

But yeah, you'd have to piss off a bunch of users in prod first to discover the blind spot.


Or learnt to use an existing one.

I vibed a low stakes budgeting app before realising what I actually needed was Actual Budget and to change a little bit how I budget my money.


I can't remember what the technique is called, but back in the GPT-4 days a paper was published about making a number of attempts at responding to a prompt and then having a final pass where the model picks the best one. I believe this is part of how the "Pro" GPT variant works, and Cursor also supports this in a way (though I'm not sure if the automatic pick-the-best-one at the end is part of it; I've never tried).
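This sounds like what is often called best-of-N sampling: sample several candidate responses, then select one with a scoring pass. A minimal sketch, with stand-in `generate` and `score` functions (a real setup would call an LLM API for generation and use a reward model or LLM-as-judge for scoring):

```python
import random


def generate(prompt: str, seed: int) -> str:
    # Stand-in for a nondeterministic LLM call; a real API call goes here.
    rng = random.Random(seed)
    return f"answer-{rng.randint(0, 9)} to: {prompt}"


def score(prompt: str, candidate: str) -> float:
    # Stand-in judge. A real setup might use a reward model or a second
    # LLM pass that ranks candidates; here we just use length (arbitrary).
    return float(len(candidate))


def best_of_n(prompt: str, n: int = 5) -> str:
    # Sample n candidates, then keep the highest-scoring one.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))


print(best_of_n("Explain the CAP theorem briefly", n=5))
```

The related "self-consistency" variant replaces the judge with majority voting over the candidates' final answers, which works well for tasks with a short, comparable answer.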

That's not the same picture

This is an excellent article. And can I just remind everyone that this is what human authorship looks like? Clearly not LLM generated. It has the author's unique tone, take on the subject, research, clear compelling story... A real breath of fresh air to be honest.

Recently I've been thinking that Ben's writing is more complex and verbose than ever, but I agree with your point entirely. He is writing it, not AI. I don't listen to his voiceovers, but I think of the articles as narrated by a captivating in-person presenter/lecturer.

I got a lot of <empty> as well, but was able to get a slide deck out of it before that happened, and it was reasonably good. Not one-shot good, but better than what I had previously gotten out of Opus 4.6 with a skill.

If you work with an exceptional one, sure

And India. It's a common experience that engineering teams from India will say yes to everything and then do what they think is best, rather than saying no and explaining what they want to do instead.

I believe it generates Playwright scripts (non-deterministically), which are saved and executed again (deterministically).
