Hacker News | afro88's comments

> Writing detailed specs and then giving them to an AI is not the optimal way to work with AI.

> That's vibecoding with an extra documentation step.

Read uncharitably, yeah. But you're making a big assumption that the writing of the spec wasn't driven by the developer, checked by the developer, adjusted by the developer, and rewritten when incorrect.

> You can still make the decisions, call the shots

One way to do this is to do the thinking yourself, tell it specifically what you want it to do and... get it to write a spec. You get to read what it thinks it needs to do, and then adjust or rewrite parts manually before handing off to an agent to implement. It depends on task size, of course: if the task is small or simple enough, no spec is necessary.

It's a common pattern to hand off to a good instruction-following model, and a fast one if possible. Gemini 3 Flash, for example, is very good at following a decent spec. But Sonnet is also fine.

> Stop trying to use it as all-or-nothing

Agreed. Some things just aren't worth chasing at the moment. For example, in native mobile app development it's still almost impossible to get accurate, idiomatic UI that uses native components properly and adheres to the HIG, etc.


This is my workflow: converse with it to write a spec, reviewing the spec myself. Ask it to trace out how it would implement it. I know the codebase because it was originally written mostly by hand. Correct it with my best practices. Have it challenge my assumptions and read the code to do so. Then it's usually good enough to go on its own. The beauty of having a well-defined spec is that once the work is done, I can have another agent review it, and it generates good feedback if the implementation deviates from the spec at all.

I'm unsure if this is actually faster than writing it myself, but it certainly expends less mental energy for me personally.

The real gains I'm getting are with debugging prod systems. Where I would normally have to touch five different interfaces to track down an issue, I've encompassed it all within an MCP server and direct my agent through the debugging steps (check these logs, check this in the DB, etc.).
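The idea of bundling several debugging interfaces behind one surface that an agent can drive could be sketched like this. Everything here is hypothetical (class and method names, simulated backends); a real version would query actual log and database systems, and an MCP server could expose each method as a tool.

```python
from dataclasses import dataclass


@dataclass
class DebugReport:
    source: str        # which interface produced this (e.g. "logs", "db")
    findings: list     # matching entries, already filtered for the agent


class DebugFacade:
    """Hypothetical facade over the separate interfaces (logs, DB, ...)
    an engineer would otherwise check by hand. An MCP server could expose
    each method as a tool the agent calls during a debugging session."""

    def check_logs(self, service: str, needle: str) -> DebugReport:
        # Placeholder: a real version would query the log backend.
        lines = [f"{service}: simulated log line containing {needle}"]
        return DebugReport("logs", [ln for ln in lines if needle in ln])

    def check_db(self, table: str, key: str) -> DebugReport:
        # Placeholder: a real version would run a read-only query.
        return DebugReport("db", [f"{table} row for {key}: <simulated>"])


facade = DebugFacade()
print(facade.check_logs("checkout", "timeout").findings)
```

The point of the facade is less the code than the contract: the agent gets a small, named set of safe, read-only operations instead of raw access to five production systems.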


Experienced engineers who know the codebase and system well, and who have enough time to consider the problem properly, would likely consider this case.

But if we're vibing... this is the kind of bug that should make it back into a review agent/skill's instructions in a more generic form: if something is done to the message history, check that there are tests verifying subsequent turns work as expected.

But yeah, you'd have to piss off a bunch of users in prod first to discover the blind spot.


Or learnt to use an existing one.

I vibed a low stakes budgeting app before realising what I actually needed was Actual Budget and to change a little bit how I budget my money.


I can't remember what the technique is called, but back in the GPT-4 days a paper was published about making a number of attempts at responding to a prompt and then having a final pass where the model picks the best one. I believe this is part of how the "Pro" GPT variant works, and Cursor also supports this in a way (though I'm not sure if the automatic pick-the-best-one at the end is part of it; I've never tried).
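This sounds like what is often called best-of-N sampling: sample several candidate responses, then select one with a scoring pass. A minimal sketch, with stand-in `generate` and `score` functions (a real setup would call an LLM API for generation and use a reward model or LLM-as-judge for scoring):

```python
import random


def generate(prompt: str, seed: int) -> str:
    # Stand-in for a nondeterministic LLM call; a real API call goes here.
    rng = random.Random(seed)
    return f"answer-{rng.randint(0, 9)} to: {prompt}"


def score(prompt: str, candidate: str) -> float:
    # Stand-in judge. A real setup might use a reward model or a second
    # LLM pass that ranks candidates; here we just use length (arbitrary).
    return float(len(candidate))


def best_of_n(prompt: str, n: int = 5) -> str:
    # Sample n candidates, then keep the highest-scoring one.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))


print(best_of_n("Explain the CAP theorem briefly", n=5))
```

The related "self-consistency" variant replaces the judge with majority voting over the candidates' final answers, which works well for tasks with a short, comparable answer.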

That's not the same picture

This is an excellent article. And can I just remind everyone that this is what human authorship looks like? Clearly not LLM generated. It has the author's unique tone, take on the subject, research, clear compelling story... A real breath of fresh air to be honest.

Recently I've been thinking that Ben's writing is more complex and verbose than ever, but I agree with your point entirely. He is writing it, not AI. I don't listen to his voiceovers, but I think of the articles as narrated by a captivating in-person presenter/lecturer.

I got a lot of <empty> as well, but was able to get a slide deck out of it before that happened, and it was reasonably good. Not one-shot good, but better than what I had previously gotten out of Opus 4.6 with a skill.

If you work with an exceptional one, sure

And India. It's a common experience that engineering teams from India will say yes to everything and then do what they think is best, rather than saying no and explaining what they want to do instead.

I believe it generates Playwright scripts (non-deterministically), which are saved and executed again (deterministically).
