Hacker News | CraigJPerry's comments

I prefer "the bottleneck is understanding" framing.

The author is nibbling at the same problem ultimately, but i don't think "hey one strategy is we could just let cognitive debt accumulate so we can go faster!" is a particularly insightful tool in the toolbox. Don't misread me, i'm not denying it can be a valid strategy.

Instead i want to read about insightful strategies for optimising that system-wide bottleneck we have: understanding.

Tell me about how you managed to shift to a higher level of abstraction, tell me about how and when that abstraction leaks. Tell me how you reduced the amount of information that has to flow through the system bottleneck.


>> many small decisions

It’s making guesses, not decisions. Framing them as decisions will lead you astray, toward wasted time and tokens.

It’s vaguely productive to tell them a ton of relevant info upfront attempting to minimise their need for load bearing guesses. I say vaguely because obedience is generally only around the level where it's good enough to lull you into a false sense of security, not to actually be obedient.

It’s a bit more productive to use the various loop mechanisms (hooks, /goal etc) to evaluate each end of turn against guard rails and reject with clear instruction on what’s unacceptable. Obviously if you only do this without the front load of info then you’re likely to spend more tokens to reach a satisfactory end of iteration.
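To make the loop idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `run_with_guard_rails`, the two example rails and `fake_agent` are hypothetical names, not the API of any real harness, and a real hook would call the actual agent rather than a stub.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A guard rail inspects the end-of-turn output. It returns None when the
# output passes, or a clear instruction saying what is unacceptable.
GuardRail = Callable[[str], Optional[str]]


def no_todo_markers(output: str) -> Optional[str]:
    # Hypothetical rail: reject placeholder work.
    if "TODO" in output:
        return "Remove TODO placeholders and implement the code."
    return None


def mentions_tests(output: str) -> Optional[str]:
    # Hypothetical rail: require some sign of a test.
    if "test" not in output.lower():
        return "Include a test for the change."
    return None


@dataclass
class LoopResult:
    output: str
    turns: int
    accepted: bool


def run_with_guard_rails(
    agent_turn: Callable[[str, List[str]], str],
    prompt: str,
    guard_rails: List[GuardRail],
    max_turns: int = 5,
) -> LoopResult:
    """Re-run the agent until every rail passes or max_turns is used up."""
    feedback: List[str] = []
    output = ""
    for turn in range(1, max_turns + 1):
        output = agent_turn(prompt, feedback)
        # Collect a rejection message from every rail that fails.
        feedback = [msg for rail in guard_rails if (msg := rail(output)) is not None]
        if not feedback:
            return LoopResult(output, turn, True)
    return LoopResult(output, max_turns, False)


def fake_agent(prompt: str, feedback: List[str]) -> str:
    # Stand-in for a real model call: guesses badly first, then complies.
    if not feedback:
        return "def f(): pass  # TODO"
    return "def f(): return 1\n# test: assert f() == 1"


result = run_with_guard_rails(fake_agent, "write f", [no_todo_markers, mentions_tests])
print(result.accepted, result.turns)
```

The stub needs a second turn here because nothing was front loaded into the prompt, which is the extra token cost described above.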


If I perfectly know all the guardrails I need, I don't need an LLM, only Prolog.

That was a really thoughtfully put together answer. Kudos, i enjoyed reading that

i turned off adguard on ios safari to see what you're talking about... oh wow! You really didn't overstate it.

In other news, my appreciation of the block lists i have configured in adguard just notched up a few more points!


>> like a Rolex watch

A Grand Seiko could be an apt comparison, this is hand finished rather than mass produced on a production line. Also, by a Japanese craftsperson using a prized skill (lacquer vs zaratsu).

>> vanity item

Who covets a calculator? The attraction here is surely celebrating the craftsmanship and the story / history behind the product and firm that produced it.


But surely, a Casio would be the fitting watch to wear while wielding this work: https://www.casio.com/us/watches/gshock/product.MRG-B2000JS-...


10x what? 10x revenue? 10x features shipped? What's the measure, is it 10x speed of dev like parent comment? Because an unqualified 10x could mean 10x SLOC which is trivial with an agent but has negative value.

Assuming 10x on the speed of dev, is the vscode repo a decent example? Recently they've been all in on AI augmented development so i'm thinking they'd be a reasonable subject?

How do you isolate out what counts as the "development" part of their delivery cycle (is that the dev inner loop, does that show up in frequency of commits then?) to measure it and see if it's running 10x?

https://github.com/microsoft/vscode/graphs/contributors?from...
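If commit frequency were taken as the proxy, a crude sketch of the comparison might look like the Python below. The function names and every date in it are made up for illustration, not pulled from the vscode repo, and the metric ignores commit size and quality, which is exactly the weakness of the proxy.

```python
from collections import Counter
from datetime import date
from typing import Iterable


def commits_per_week(commit_dates: Iterable[date]) -> float:
    """Mean commits per ISO week, counting only weeks with at least one commit."""
    weeks = Counter(d.isocalendar()[:2] for d in commit_dates)
    return sum(weeks.values()) / len(weeks) if weeks else 0.0


def speedup(before: Iterable[date], after: Iterable[date]) -> float:
    """Ratio of weekly commit rates between two periods."""
    base = commits_per_week(before)
    return commits_per_week(after) / base if base else float("inf")


# Made-up commit dates: two commits a week before, four a week after.
before = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 8), date(2024, 1, 9)]
after = [date(2025, 1, d) for d in (6, 7, 8, 9, 13, 14, 15, 16)]

print(speedup(before, after))
```

A ratio like this only says commits landed more often. It says nothing about whether the dev inner loop itself got faster.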


Guarantee it’s the same story I get from all my friends/coworkers who are now 10x… they are 10 times faster at starting random projects that get to 80% done that they can’t finish, so they immediately move on to the next project because their velocity is so high


From a software quality or software engineering POV --> this is clearly not building durable value, not scalable, etc. So I'd agree with you there.

But from the POV of, say, a young startup company looking for PMF and navigating the ambiguities involved with trying to figure out what is the "right thing" to build that will appeal/delight/convince-people-to-pay --> being 10X faster at shipping 80% done projects is actually an incredibly, unfathomably valuable superpower. And it is also rationally the "right thing" to do, to make lots of cheap bets and fail/learn fast.

I find that many folks on my team (I am a manager/leader of a small-to-mid-size eng org) struggle to accept the difference between two kinds of project (where the same team may need to do both kinds of work, all the time and in parallel):

- "Hey, the company needs, and you and I both agree, that this situation calls for building/renovating a skyscraper --> please design a fucking strong/safe/reliable skyscraper and don't take any shortcuts, this requires 'real' engineering"

- Vs, "Hey, the company isn't sure what it needs, and neither you nor I know any better either, so let's try a bunch of different shacks/sheds/treehouses/whatever, until we find something that has traction / makes us money (and it's okay if the shed collapses -- so long as the business knows this too, that it wasn't meant to be a load-bearing, skyscraper-esque thing anyways)"

I won't get into the rabbit hole of talking about dealing with bad business leaders, who want a skyscraper but expect to pay the price of a shack/shed. Let's assume that we are talking about the type of companies (maybe the minority) that are reasonable enough to know and acknowledge the difference. Then what is the game-theoretic/rational thing for them to do, and how does this 10X idea express itself? That's where my argument is coming from.


It's 10x code generation with .5x quality at best and all other parts of the SDLC are at 1.x or worse.

AI is not delivering 10x shareholder value, anywhere. Software developers have quite the level of hubris about how important they are to companies. Yes our work is very complex and takes a certain mindset to do it well. It takes a lot of other roles to have a successful business, many of those roles will use AI to help draft slide decks, emails, etc. and that's the limit for them.

Look at recent companies doing layoffs claiming it's because of AI, like Cloudflare and Coinbase: do their reported financials paint the picture that they are crushing it with AI? No, it's net losses in the $100s of millions.


> AI is not delivering 10x shareholder value, anywhere.

A bit facetious, but I'd expect Nvidia and the like providing the "AI equipment" to have a 10× share value at least…


As always, the real money in a gold rush goes to the people selling shovels.

I guess it doesn't have to use JavaScript for the back behavior. It could use a server-side rendered referrer if that hasn't been stripped by the browser?

You say that JavaScript and fallbacks for menus are a solved issue but the number of menus that are just an absolute clusterfuck is ridiculous on the web today. They're really not a solved issue. Progressive enhancement is hard to do. Genuinely hard in some cases.

On balance, while this is not without flaws, it's interesting. Accessibility, deep linking, reduction in cognitive load for the developer. There's some merit here.


OrbStack is impressive on the performance and energy efficiency fronts. I'm not aware of anything that comes close. But they're doing something funky under the covers. You can't just start any OS in a VM. It has to be somehow mangled to suit their VM. Thankfully NixOS is available so I'm fine for my use cases. It's still remarkable how efficient it is.


Yeah, it's like WSL. It starts just one VM and then your individual "machines" are LXC containers underneath. If you peek at the vendor-supplied file your NixOS OrbStack Machine includes you can see some of it.

They're constantly doing other optimizations in other ways, too. But that's the one you were pointing at, I think.


That's also what Colima does.

OrbStack isn't open-source though and I can't justify buying a license for every single person in my company just for something functionally equivalent but performing better.

These kinds of things should just be provided by Apple as a first-class thing.


The cheapest copilot plan felt totally unsustainable to me. For around £8 a month i was getting 100 opus 4.6 prompts (albeit with a reduced context window size around 128k iirc vs 200k to 1m for first party hosted opus). Gpt5.4 was hosted with 400k context iirc.

On top of that, you’ve got 2000 minutes of container runtime, so running cloud agents was included. As was anthropic agent sdk mode via copilot which is very comparable with claude code. It's not identical: the anthropic “modular prompt” is much leaner in the sdk version.

I can't say i'm mad, i got more value than i paid for. That said, going forward i'll probably go back to openrouter payg rather than a subscription.

I got a free 3 months of the gemini £19 plan and i've been playing quite a bit. 3.1 pro is a good model, i just find it slow. Flash i think i underappreciated until now.


I AM mad, because I just signed up for my private copilot sub. Otoh I do have it at work. I suspect we will start noticing differences there, too, but it all depends on the top users vs the average user since it is now pooled, IF they don't just allow extra usage. But that has become more expensive now, so not sure what they will do. I am hopeful that very few actually even use it. In my and surrounding teams, only maybe 1 out of 3 people really use it much.


>> Can't think of anything it's currently lacking.

Speed? The pro models are slow for me

The 3.1 pro model is good and i don't recognise the GP's complaint of broken tool calls, but i'm only using it via the gemini cli harness. Sounds like they might be hosting their own agentic loop?

