> It's been years of development and you still can't trust it to get basic facts correct
There's the rub: AI is not an oracle. It's neither designed nor intended to provide accurate recall of all facts. It's closer to a reasoning engine than anything IMO.
Oh, and for the record: I don't trust people to get basic facts correct, either. It's already far better than the average human at trivia.
For personal projects that I don't plan to share widely, I'm making it a point to not look at the code at all. So far - and to my surprise - I've not only found that this has resulted in no more bugs than before, but it seems to result in fewer bugs over time. Every time I find a bug or a regression, I add it to the specification. My SDLC requires that every specification have at least one associated test. Not every function, or every line, or anything like that - every specified feature. The end result has been that my projects have matured over time much faster than if I'd been more closely involved.
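As a rough sketch of the "every specification has at least one associated test" rule, something like the following could act as the gate. The names here (`Spec`, `untested_specs`, `record_regression`) are made up for illustration and aren't from any particular tool; a real setup would presumably read specs and tests from files rather than build them in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """One specified feature plus the names of the tests that cover it."""
    spec_id: str
    description: str
    tests: list = field(default_factory=list)


def untested_specs(specs):
    """Return the ids of specs that have no associated test."""
    return [s.spec_id for s in specs if not s.tests]


def record_regression(specs, spec_id, description, test_name):
    """Add a newly found bug to the spec list together with its guarding test."""
    spec = Spec(spec_id, description, [test_name])
    specs.append(spec)
    return spec


if __name__ == "__main__":
    specs = [
        Spec("login", "User can log in", ["test_login_ok"]),
        Spec("export", "CSV export works"),  # no test yet, so the gate flags it
    ]
    missing = untested_specs(specs)
    if missing:
        print("Specs without tests:", ", ".join(missing))
```

The check is per specified feature, not per function or per line, so a bug only needs one spec entry and one test to stay fixed.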
I've already toyed with writing some projects in Nim and Haskell for token efficiency. At some point I plan to put together a simple test project, then do a comparison of token efficiency with every language I can think of to find the one that I'm able to generate most quickly, correctly, and cheaply.
Your experiences must be much different from mine.
Three years ago, AI was barely able to provide sort-of reliable command completion.
Two years ago, it could extrapolate a single function from a docstring - but the docstring had to be so verbose that it wasn't practical to use in that way.
A year ago, I was tinkering with Devin to try to find a way to get it to reliably implement small, isolated features from verbose Jira tickets.
Six months ago, I started using AI to generate the majority of my code output. Most of my time was spent reviewing, and I was ecstatic to reach ~2x output because I could run the next task while reviewing the last.
Now, at work I'm managing a half dozen Claude Code instances, Devin sessions, and orchestrating a review loop between Claude, Devin, and CodeRabbit. It's not uncommon for me to be working on four or more discrete features at once. My output is approximately 15x my pre-AI baseline - and I've not sat down and written a line of code directly in six months.
At home I'm managing a Hermes agent that can spin up a whole fleet of purpose-tuned agents for whatever task I'd like. I've implemented spec-driven development à la Acai, and extended it to the point that my agent creates specs from text or voice conversation, I review them, and it handles implementation end-to-end. The code itself is an almost disposable artifact - useful primarily to ensure no regressions have been introduced between rounds.
... I simply don't understand how you can assert that "it's been basically the same for 3 years". It absolutely has not.
It sounds like our experiences are different. My software work isn’t on products where code can be disposable, since it affects people’s lives in material ways. I’m not sure why you’re launching fleets of agents at home, either.
C'mon - Cursor has been out for like 3.5 years at this point. AI was still in its infancy but it was definitely able to complete tasks, albeit smaller ones.
Not disputing the overall trajectory, yeah it’s gotten better. But it was definitely capable of more than just command completion 3 years ago.
I reach for it more frequently. But personally, it’s at the point of diminishing returns for my work. It’s capable enough now to handle most of the things I want to throw at it, sometimes it’s wrong, sometimes it’s right.
I’m not doing cutting edge deep tech work - and I also don’t have the motivation (or salary increase) to be 15X more productive, if that’s even measurable. We are so busy because the CEO hears these “15X” statements and then the pressure is on to match or exceed that, and I’m not playing that game.
Yeah, I agree - I get what the author is saying, but I also don't expect "translator" to be a practical career path in the future.
Even small, dumb, local models are excellent at translation already. Frontier models are on par with or better than the human translations we've tested them against at work.
The idea of not naming G-d is based on the concept of G-d being beyond human comprehension and not within the realm of human language. G-d can name things in our world, but it’s a one-way street. For humans it’s presumptuous to assign a name to G-d. It implies an understanding that can’t exist.
What would cause an increase in the number of open lower paid and/or service industry jobs while simultaneously reducing the number of openings in tech?
Every single one of those examples is both valid and - I believe, at least - misunderstood.
Musk has a singular goal as far as I can tell: to make humanity a multi-planetary species. All of those things are testing the boundaries of what's possible in areas that will or could be very important for building a permanent settlement on Mars.
I posit that while there's much room for debate around whether or not those projects are viable, as far as I can tell everything Musk has done has been in service of building the corporate framework, talent pool, skills, and technology necessary to colonize Mars.
Ok. A permanent settlement on Mars. Given that the control structure of his companies gives him autocratic control there, why would anyone believe he would be anything but a Martian autocrat, and who in their right mind would willingly submit their own sovereignty to Emperor Musk the First of Mars? It’s not exactly like you could change your mind and walk away. You’d be literally putting your life in the hands of this wildly erratic person.
While my own views are likely closer to Musk's than I suspect yours are, I share those concerns. I don't think I'd be interested in moving there, at least not in the initial waves.
My optimistic view of the future looks more like "The Moon is a Harsh Mistress" than "Total Recall".
I'm sorry, you'll have an awful lot of explaining to do if you want me to believe Hyperloop, a 100+ year old idea proven not to work, and the Vegas Loop, one of the most asinine infrastructure projects I've ever witnessed in my life, could possibly in any way whatsoever contribute to life on Mars.
Terrain is the best available protection from radiation on Mars, and it's far easier to pressurize a structure excavated from bedrock than it is to pressurize a dome or similar.
Hyperloop and the Vegas Loop are projects used to justify the existence of the Boring Company. The Boring Company's tech is definitely relevant to extraterrestrial habitat creation.
Rather than going into detail, getting myself on yet another watchlist, and possibly inspiring someone to do something both criminal and counter-productive, I'll just quote Patton:
> Fixed fortifications are a monument to the stupidity of man.