Stars have been useless as signals for project quality for a while. They’re mostly bought, at this point. I regularly see obviously vibe-coded nonsense projects on GitHub’s Trending page with 10,000 stars. I don’t believe 10,000 people have even cloned the repo, much less gotten any personal value from it. It’s meaningless.
I'm with you on all points except for it being bought.
Programming has long succumbed to influencer dynamics and is subject to the same critiques as any other kind of pop creation. Popular restaurants, fashion, movies - these aren't carefully crafted boundary pushing masterpieces.
Pop books are hastily written and usually derivative. Pop music is the same, as is pop art. Popular podcasts and YouTube channels are usually just people hopping unprepared on a hot mic and pushing record.
Nobody is reading a PhD thesis or a scholarly journal on the bus.
The markers for the popularity of pop works are fairly independent from the quality of their content. It's the same dynamics as the popular kid at school.
So pop programming follows this exact trend. I don't know why we expect humans to behave foundationally differently here.
> Nobody is reading a PhD thesis or a scholarly journal on the bus.
As someone who is involved in academia, I can attest that most of my colleagues (myself included) do in fact read quite a few papers on buses (and trams - can't forget those).
> I'm with you on all points except for it being bought.
Stars get bought all the time. I've been around the startup scene, and this is basically part of the playbook now for the open-core model. You throw your code up on GitHub, call it open source, then buy your stars early so it looks like people care. Then you charge for hosted or premium features.
There's a whole market for it too. You can literally pay for stars, forks, even fake activity. Big star count makes a project look legit at a glance, especially to investors or people who don't dig too deep. It feeds itself. More people check it out, more people star it just because others already did.
I have 60-ish repos: the vast majority have zero stars, one or two have a star or two, and one has 25-ish. To me, stars are a signal of interest in and usage of a project.
Doesn’t mean stars are perfect, or can’t be gamed, or anything in a universally true generalization sense. But also not meaningless.
Ape thinking is a cognitive practice where a human deliberately solves problems with their own mind. Practitioners of ape thinking will typically author thoughts by thinking them with their own brain, using neurons and synapses.
The term was popularized when asking a computer to do it for you became the dominant form of cognition. "Ape thinking" first appeared in online communities as derogatory slang, referring to humans who were unable to outsource all their thinking to a computer. Despite the quick spread of asking a computer to do it for you, institutional inertia, affordability, and limitations in human complacency were barriers to universal adoption of the new technology.
Their design approach wasn’t particularly unusual, so I’m not sure what that sentence means.
I do miss the days when technical reports were clear and concise. This one has some interesting information, but it’s buried under a mountain of empty AI-written bloat.
It's annoying because it is a super common widget and it is interesting work. The first draft, or literally even the prompt they gave the AI, probably would've been a great post; all they had to do was not ensloppify it...
> but the number of problems requiring deep creative solutions feels like it is diminishing rapidly.
If anything, we have more intractable problems needing deep creative solutions than ever before. People are dying as I write this. We’ve got mass displacement, poverty, polarization in politics. The education and healthcare systems are broken. Climate change marches on. Not to mention the social consequences of new technologies like AI (including the ones discussed in this post) that frankly no one knows what to do about.
The solution is indeed to work on bigger problems. If you can’t find any, look harder.
I’m honestly surprised LLMs are still screwing up citations. It does not feel like a harder task than building software or generating novel math proofs. In both those cases, of course, there is a verifier, but self-verification with “Does this text support this claim?” seems like it ought to be within the capabilities of a good reasoning model.
But as I understand the situation, even the major Deep Research systems still have this issue.
The article presents AGENTS.md as something distinct from Skills, but it is actually a simplified instance of the same concept. Their AGENTS.md approach tells the AI where to find instructions for performing a task. That’s a Skill.
I expect the benefit is from better Skill design, specifically, minimizing the number of steps and decisions between the AI’s starting state and the correct information. Fewer transitions -> fewer chances for error to compound.
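To put rough numbers on the compounding argument, here is a minimal sketch. It assumes each step succeeds independently with the same probability, which is a simplification and not something the article measured; `chain_success` and the 0.95 figure are made up for illustration.

```python
def chain_success(per_step_success: float, steps: int) -> float:
    """Chance that every step in a chain succeeds.

    Assumes steps are independent and equally reliable (an illustrative
    simplification, not a measured property of any agent).
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # Hypothetical 95% reliability per lookup or decision.
    for steps in (1, 3, 10):
        print(steps, round(chain_success(0.95, steps), 3))
    # prints:
    # 1 0.95
    # 3 0.857
    # 10 0.599
```

Under those assumptions, cutting a ten-step lookup path down to three steps moves the odds of reaching the right information from about 60% to about 86%, without making any individual step more reliable.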
1. Those I force into the system prompt using rules-based systems and "context"
2. Those I let the agent look up or discover
I also limit what gets into message parts, moving some of the larger token consumers to the system prompt so they only show up once, most notably read/write_file.
Am I wrong that this entire approach to agent design patterns is based on the assumption that agents are slow? Which yeah, is very true in January 2026, but we’ve seen that inference gets faster over time. When an agent can complete most tasks in 1 minute, or 1 second, parallel agents seem like the wrong direction. It’s not clear how this would be any better than a single Claude Code session (as “orchestrator”) running subagents (which already exist) one at a time.
It's likely, then, that you are thinking too small. Sure, for one-off tasks and small implementations, a single prompt might save you 20-30 minutes. But when you're building an entire library/service/software in 3 days that would normally have taken you 30 days by hand, the real limitation comes down to how fast you can get your design into a structured format, as this article describes.
Still seems slow! I’m asking what happens in 2028, when your entire project is 5-10 minutes of total agent runtime, meaning time actually spent writing code and implementing your plan. Trying to parallelize 10 minutes of work with a “town” of agents seems like unnecessary complexity.
I think most of the anecdotal and research experience I've seen with AI agents so far tells us that you need at least a couple of passes to converge on a good solution. So even in your future vision, where we have models 5x as good as now, I'll still need at least a few agents to ensure I arrive at a good solution. By this I specifically mean a working implementation of the design, not an incorrect assumption about the design that leads the AI off on the wrong path, which is the main issue I keep hearing about over and over. So, coming back to your point: assuming we can have the 'perfect' design document that lays out everything, yeah, we'll probably only need like 5 agents total to actually build it in a few years.
I agree that LLMs can be useful companions for thought when used correctly. I don’t agree that LLMs are good at “supplying clean verbal form” of vaguely expressed, half-formed ideas and that this results in clearer thinking.
Most of the time, the LLM’s framing of my idea is more generic and superficial than what I was actually getting at. It looks good, but when you look closer it often misses the point, on some level.
There is a real danger, to the extent you allow yourself to accept the LLM’s version of your idea, that you will lose the originality and uniqueness that made the idea interesting in the first place.
I think the struggle to frame a complex idea, and the frustration you feel when the right framing eludes you, is actually where most of the value is. The LLM cheat code to skip past this pain is not really a good thing.
I often discuss ideas with peers that I trust to be strong critical thinkers. Putting the idea through their filters of scrutiny quickly exposes vulnerabilities that I'd have to patch on the spot, sometimes revealing weaknesses resulting from bad assumptions.
I started to use LLMs in a similar fashion. It is a different experience. Where a human would deconstruct you for fun, the LLM tries to engage positively by default. Once you tell it to say it the way it is, you get the "honestly, this may fail and here's why".
In my assessment, an LLM is better than being alone in a task, and that is the value proposition.