
I'm with you all the way here. I derive zero pleasure from simply typing out the code once the spec is clear. Having a fast-forward button to skip that phase is a pure win in my book.

I do get pleasure from typing out the code in some languages (and not in others; hello javascript, java!). Similarly, I love writing text with a calligraphy or fountain pen. However, I can't dedicate too much of the work / business time to whatever is more pleasurable.

So, I "doodle" some text / ideas / planning with a calligraphy pen, and type in some code, occasionally, both mainly for the fun aspects. There are side benefits to both, too. Writing some plans slowly and "beautifully" drags them out and I get to think longer on them, so the sporadic "nice looking plans" are often more well thought. And doing the coding all by myself stops my brain from losing the ability. I was initially in the 100% AI-writes-all-code camp for a while and noticed I am getting notably slow in some personal coding skills. It is too early to treat specs as the new code and old languages as assembly (but I admit we might get there some day).

In other words, I think AI doing 90-99% of the coding, depending on the language verbosity and AI accuracy for the code at hand, is quite reasonable.


Personally, this is the experience I thought about before writing my comment. In the days before AI coding assistants, what you describe was the intrinsically human experience required to write code by hand. The wonder, the joy, the frustration, the confusion, the elation--the discovery. These days, the things I wonder about lie deeper and deeper behind more and more lines of code, through journeys that provide less and less joy, and thus become more and more unreachable, since I'm human, bound by plenty of constraints besides time. AI has helped me rediscover some of this sporadic creativity, demonstrably, due to its ability to prototype recreational ideas on a whim.

Professionally, I'm employed writing safety-critical avionics software. Copious amounts of cogent tooling putting guardrails on agents have enabled me to spend far more time thinking deeply about how the software should work at a systemic level. The code by definition must be heavily criticized and battle-tested before it can go out the door to begin with. Though they're a beautiful part of coding, those sporadic bursts of creativity drive the code leaving my desk less and less, and I feel strongly that has paradoxically made its quality better, since I've spent much more time on broader implications and interactions.


100% the opposite here. I derive all the pleasure from writing code, which is why I'm still writing code.

A spec can be wrong until you prove it is right.

Not what I do. I'll reformulate the ticket description so that the purpose and as many details as possible about the solution are made clear from the start. Then I tell Opus to go and research the relevant parts of the codebase and what needs to be done, and write its findings to a research.md file. Then I'll review that file, bring answers to any open questions and hash out more details if any parts seem fuzzy. When the research is sound I'll ask Opus to produce a plan.md document that lists all the changes that need to be made as actionable steps (possibly broken into phases). Then I'll let Sonnet execute the steps one by one and quickly review the changes as we go along.
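For a concrete idea, the plan.md I end up reviewing typically looks something like this (purely illustrative; the headings and checkbox convention are just my own, nothing the tools require):

    # plan.md

    ## Summary
    One paragraph on the chosen approach, linking back to research.md.

    ## Phase 1: data layer
    - [ ] Step 1: add the new column + migration
    - [ ] Step 2: backfill existing rows

    ## Phase 2: API
    - [ ] Step 3: extend the serializer and endpoint
    - [ ] Step 4: update the integration tests

Each step should be small enough that its diff is reviewable in one sitting.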

You are making it too hard on yourself. Most people would just paste the ticket URL and type "fix this", then spend the next 3 hours on social media.

OTOH, I try hard to provide all possibly relevant context, manually copy/paste just the relevant log excerpts to reduce context overhead, and always ask for an implementation plan that I review before any code changes are made. Yet I often feel like a dinosaur here; all the coworkers who tout "LLM productivity" just type a few words in and let the agent spin for hours without any guidance.


I'd call that irresponsible use. One of the principles I try to stick to is to never offload any major decision-making to the LLM without oversight, because some percentage of the decisions it makes are going to be wrong (and more often just against my taste).

Just out of curiosity, what type of systems are you working on? What type of features did you implement on your 100k LOC week?

I don't know about the GP, but my workflow is similar to theirs; the difference is that I aim to ship low thousands of lines per week. The fewer the better. I even tell the agent to only write high-SNR tests, otherwise it just adds useless "make sure this function returns this thing we hardcoded" ones.
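To illustrate the difference (a made-up Python example, since that's most of my backend work; the function is just a stand-in for whatever the agent is testing):

    # Stand-in function so the example runs on its own.
    def normalize_email(raw: str) -> str:
        return raw.strip().lower()

    # Low SNR: restates an output the implementation already produces,
    # so it can't catch a regression that matters.
    def test_normalize_email():
        assert normalize_email("a@b.com") == "a@b.com"

    # Higher SNR: pins down the behavior we actually care about.
    def test_normalize_email_strips_and_lowercases():
        assert normalize_email("  Alice@Example.COM ") == "alice@example.com"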

I usually succeed, BTW. I spend a lot of time planning, but usually each PR is a few hundred lines, and fairly easily reviewable.

I mostly work with Python backends, though these days it might be any language (Ruby, Go, TS).


> What type of features did you implement on your 100k LOC week?

I work on 3rd party API integrations, of which we have hundreds, each in its own repo. We need to build thousands more at a fraction of the cost. Any given integration historically takes a human anywhere from a few days to a few months to build, and is subject to ongoing maintenance. We frequently do not have access to the API, and even when we do, we almost never have a representative data set. Complex APIs tend to expose multiple, entwined data models. Documentation may be wrong or in a foreign language.

I've been building a new framework to do it better. Ideally, we can get an agent to spit them out in a few minutes to hours with a much reduced ops burden for managing the fleet, all with very high confidence. The latter requires pushing as much into the type system as possible and leveraging static analysis. Much of the work has been embarrassingly parallelizable. Consider categorizing access patterns across the entire set, or ensuring byte-for-byte parity (over the input space of third-party API responses).
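As a sketch of the spirit of those parity checks (not our actual framework; the model and helper here are hypothetical, with pydantic standing in for "push it into the type system"):

    import json
    from pydantic import BaseModel

    # Hypothetical typed model for one third-party payload shape.
    class Invoice(BaseModel):
        id: str
        amount_cents: int
        currency: str

    def round_trips(raw: bytes) -> bool:
        # Parse into the typed model, re-serialize, and compare as
        # canonical JSON; a mismatch means the model drops or mangles
        # some field for this particular input.
        parsed = Invoice.model_validate_json(raw)
        return json.loads(parsed.model_dump_json()) == json.loads(raw)

Run something like that over every captured response per integration and you get a parity property over the observed input space, and it parallelizes trivially across the fleet.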

This is absolutely not a problem that a human or 2 could tackle prior to AI.


>I've been building a new framework to do it better. So you're using your software factory to build a software factory. Not building thousands of integrations at a fraction of the cost.

There are many ways to use an LLM to generate a piece of software. I base most of my projects these days around sets of Markdown files where I use AI first to research, then plan and finally track the progress of implementation (which I do step-wise with the plan, always reviewing as I go along). If I was asked to provide documentation for my workflow those files would be it. My code is 99% generated, but I take care to ensure the LLM generates it in a way that I am happy with. I'd argue the result is often better than what I'd have managed on my own.

Yep pretty much same, although if I’m lax at any point of the reviewing (in-progress or final), I’d say the quality quickly drops to below my average manual effort, and then I don’t even have the benefit of thinking it all through as directly. I think getting really quality results out of LLM code generation for non-trivial projects still needs quite a bit of discipline and work.

Anyone test it out for generating 2D art for games? Getting nano banana to generate consistent sprite sheets was seemingly impossible last time I tried, a few months ago.


I'm still looking for a free tool that converts images to 3D models well.


Debugging and local testing are the main remaining use cases for IDEs, especially for mobile apps where you need to manage one or more emulators.


Not on Android, where the debugger works once in a blue moon.


At some point I think I'd prefer to deploy my own model in Azure or AWS and simply bring the endpoint to the coding harness.


As an AI-aware software engineer currently creating systems that integrate with LLM provider APIs for my company (who also has no idea what an eval is or how a data scientist thinks about RAG), I honestly don't see what value a data scientist would bring to the table for my team. Maybe someone would care to enlighten me?


Your view of what is happening in the neural net of an LLM is too simplistic. In the regard you're describing, they likely aren't subject to any constraints that humans aren't also subject to. What I do know to be true is that they have internalised mechanisms for non-verbalised reasoning; I see proof of this every day when I use the frontier models at work.


Opus 4.6 has a 200k context limit in Copilot. Could be the issue.


IIRC the context limit in Copilot is actually 128k, not 200k.

200k is the normal context limit elsewhere.

