One aspect you have to consider is the differences between the human beings doing the evaluation. I had a coworker/report who would hand me obvious garbage-tier code with glaring issues even in its output, and it would take multiple iterations to address very specific review comments (once, in frustration, I showed a snippet of their output to my nontechnical mom, and even my mom wtf’ed and pointed out the problem unprompted); I’m sure all the AI-generated code I painstakingly spec, review and fix would look totally amazing to them and seem to need very little human input. Not saying that must be the case here, since that was extreme, but it’s a very likely factor.
This is plausible. Assuming it’s true, we would see the adoption of vibe coding at a faster rate amongst inexperienced developers. I think that’s true.
A counterpoint is Google saying the vast majority of their code is written by AI. The developers at Google are not inexperienced. They build complex critical systems.
But it still feels odd to me, this contradiction. Yes, there’s some skill to using AI, but that doesn’t feel like enough to explain the gap in perception. Your point would explain it wonderfully well, but it’s contradicted by pronouncements from major companies.
One thing I would add is that code quality is absolutely tanking. PG mentioned YC companies adopted AI-generated code at Google levels years ago. Yesterday I was using the software of one such company and it has “Claude Code” levels of bugginess. I see it in a bunch of startups. One of the tells is that they seem to experience regressions, which is bizarre. I guess that indicates bugs in their AI-generated tests.
Fairly certain they do something like what Anthropic does: they count the acceptance rate or something else that is fairly "optimistic" (my org has a code acceptance rate of 98.5% per the platform dashboard).
So, to clarify, me accepting the suggestion and then correcting it by hand still counts as N LoC accepted.
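To make the gap concrete, here is a rough sketch of how the two numbers can diverge. The `Suggestion` record, its fields, and the sample log are all made up for illustration; I don't know how any vendor's dashboard actually computes its figure.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """Hypothetical record of one AI suggestion (not a real dashboard schema)."""
    lines_suggested: int  # LoC the tool proposed
    accepted: bool        # did the developer hit "accept"?
    lines_kept: int       # LoC still unchanged after hand-correction


def acceptance_rate(suggestions):
    """Optimistic metric: every accepted line counts, even if rewritten later."""
    total = sum(s.lines_suggested for s in suggestions)
    accepted = sum(s.lines_suggested for s in suggestions if s.accepted)
    return accepted / total if total else 0.0


def retention_rate(suggestions):
    """Stricter metric: only lines that survived manual correction count."""
    total = sum(s.lines_suggested for s in suggestions)
    kept = sum(s.lines_kept for s in suggestions if s.accepted)
    return kept / total if total else 0.0


# Invented sample data: two accepted suggestions (one heavily rewritten), one rejected.
log = [
    Suggestion(lines_suggested=40, accepted=True, lines_kept=10),
    Suggestion(lines_suggested=50, accepted=True, lines_kept=45),
    Suggestion(lines_suggested=10, accepted=False, lines_kept=0),
]
print(f"accepted: {acceptance_rate(log):.0%}, kept: {retention_rate(log):.0%}")
# prints "accepted: 90%, kept: 55%"
```

Same session, same code, and the headline number changes a lot depending on whether "accepted" or "kept" is what gets counted.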
This is magical because you are both on the exact right path and not right. My theory is there’s a sort of skill to teasing code from AI (or maybe not and it’s alchemy all over again) and this is all new enough and we don’t have a common vocabulary for it that it’s hard for one person who is having a good experience and one person who is not to meaningfully sort out what they are doing differently.
Alternatively, it could be there’s a large swath of people out there so stupid they are proud of code your mom can somehow review and suggest improvements in despite being nontechnical.
> This is magical because you are both on the exact right path and not right. My theory is there’s a sort of skill to teasing code from AI (or maybe not and it’s alchemy all over again) and this is all new enough and we don’t have a common vocabulary for it that it’s hard for one person who is having a good experience and one person who is not to meaningfully sort out what they are doing differently.
I don't think this is a hypothesis.
Outside of asking for one-shot tasks that have been done a million times before, LLMs do not "default" to good work.
If you ask them over and over again to find holes in their solution, to fix them, to evaluate for tech debt, to test all cases, to reassess after those cases whether it's still architecturally coherent, to compare to the closest available known good implementations, etc., they can eventually get what you want done unbelievably cheaply to an acceptable level of quality.
As I mentioned initially, their work is unbelievably cheap, so you should be EAGER to reject it. Most people wouldn't even bend down to pick a penny up off the sidewalk. They can literally pump out CLs for a penny. You shouldn't even waste time looking at "I'm done" until they've gone through 10+ rounds of reviews, refactors, bug fixes, additional test cases, comparisons to known implementations, etc.
Why are you going to spend ~$50-$100+ of your time reviewing $0.01 of LLM time?! It makes no sense!
If you just listen to them say "I'm done" and move on to their next task, it won't take too many days before you're swimming in a sea of incoherent garbage.
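The loop I'm describing looks roughly like this. It's a sketch only: `generate` and `review` are placeholders for whatever model calls you actually use (nothing here is a real API), and the default of 10 minimum rounds is just the number from above.

```python
from typing import Callable, List, Tuple


def refine(
    task: str,
    generate: Callable[[str, List[str]], str],
    review: Callable[[str, str], List[str]],
    review_prompts: List[str],
    min_rounds: int = 10,
    max_rounds: int = 30,
) -> Tuple[str, int]:
    """Keep rejecting "I'm done" until a full review pass comes back clean.

    generate(task, findings) -> a new draft that addresses the findings
    review(draft, prompt)    -> list of problems found (empty if none)
    Both callables are hypothetical and supplied by the caller.
    """
    draft = generate(task, [])
    for round_no in range(1, max_rounds + 1):
        # Run every review angle (holes, tech debt, tests, ...) each round.
        findings = [f for prompt in review_prompts for f in review(draft, prompt)]
        if findings:
            draft = generate(task, findings)  # cheap: send it back
        elif round_no >= min_rounds:
            return draft, round_no            # only now spend human time on it
    return draft, max_rounds                  # gave up; still needs a skeptical read
```

The point of `min_rounds` is that a single clean review doesn't count for much; the draft only reaches a human after it has survived repeated passes.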
That’s not a given. Self-hosted GitLab on my pretty good hardware is still slow. I just opened a very small repo with ~20 files and ~5 commits. The page spun for 5s+ before showing me the directory listing and readme. Subsequent loads are faster (~1s) but still not instant.
Common pattern of checking the Claude Code issue tracker for a bug: land on issue #12587, auto closed as duplicate of #12043; check #12043, auto closed as duplicate of #11657; check #11657, auto closed as duplicate of #10645; check #10645, never got a response, or closed as not planned, or some other bullshit.
By having a reasonably successful open source project while in university. Someone reached out with work in a relevant area. I suppose that gate is mostly shut off these days with the volume of vibe-coded crap (or even non-crap) and uptick of clearly fraudulent stars on GitHub.
It wouldn't be so irritating if thinking didn't start to take a lot longer for tasks of similar complexity (or maybe it's taking longer to even start to think behind the scenes due to queueing).
> Both llama.cpp and ollama are great and focused on different things and yet complement each other
According to the article, ollama is not great (that’s an understatement), is focused on making money for the company and stealing clout and nothing else, and has hardly complemented llama.cpp at all since shortly after the initial launch. All of these claims are backed by evidence.
You may disagree, but then you need to refute OP’s points, not try to handwave them away with a BS analogy that’s nothing like the original.
The article only mentions manually building an irregularly shaped region from an image:
> Every horizontal run of non-transparent pixels becomes a tiny rectangle region, and those runs are combined into one final window region
but there's an easier way: you just use the LWA_COLORKEY flag with SetLayeredWindowAttributes to make one color transparent, like a green screen. I recall building my own desktop mascot that way. It doesn't work with arbitrary images/content of course, since the key color can't appear anywhere in the content region.
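For comparison, the run-based approach quoted from the article can be sketched in a few lines. This is my own illustration, not the article's code: the `opaque_runs` name, the alpha-mask input, and the threshold are assumptions, and it only computes the rectangles. On Windows each one would presumably then go through something like CreateRectRgn and CombineRgn before SetWindowRgn.

```python
from typing import List, Tuple

# (left, top, right, bottom), with right/bottom exclusive
Rect = Tuple[int, int, int, int]


def opaque_runs(alpha: List[List[int]], threshold: int = 0) -> List[Rect]:
    """Turn each horizontal run of non-transparent pixels into a 1px-tall rect."""
    rects: List[Rect] = []
    for y, row in enumerate(alpha):
        start = None
        for x, a in enumerate(row):
            if a > threshold and start is None:
                start = x                            # a run begins here
            elif a <= threshold and start is not None:
                rects.append((start, y, x, y + 1))   # run ended just before x
                start = None
        if start is not None:                        # run reaches the right edge
            rects.append((start, y, len(row), y + 1))
    return rects


# Tiny made-up alpha mask: 0 = transparent, 255 = opaque.
mask = [
    [0, 255, 255, 0],
    [255, 255, 255, 255],
    [0, 0, 255, 0],
]
print(opaque_runs(mask))
# prints [(1, 0, 3, 1), (0, 1, 4, 2), (2, 2, 3, 3)]
```

That is a rectangle per run per scanline, which is why the color-key route is so much less work when you can reserve one color.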