I’m on tirzepatide and it’s crazy how it can truly reform habits. I’ve been a night snacker my whole life. I don’t even think about food after dinner anymore. At a bar, I used to pound down the last quarter of a beer so I could go get another. Now I might forget to finish it and I probably won’t get another. Now I feel full while eating for the first time in my life. It’s going to be truly transformative at scale, even knowing it doesn’t work for everyone.
My doctor, who is on the older side, told me that he went through his records when GLP-1s started being prescribed for weight loss. He wanted to calculate what percentage of his patients (a) he had advised to lose weight, (b) had reduced their weight to healthy levels, and (c) had kept it off.
From the starting population of overweight people, only 3% dropped down to, and stayed at, a healthy weight.
I also think we're going to see a resurgence of either pair programming, or the buddy system where both engineers take responsibility for the prompting and review and each commit has 2 authors. I actually wrote a post on this subject on my blog yesterday, so I'm happy to see other people saying it too. I've worked on 2-engineer projects recently and it's been way smoother than larger projects. It's just so obvious that asynchronous review cycles are way too slow nowadays, and we're DDoSing our project leaders who have to take responsibility for engineering outcomes.
I’ve been playing around with these kinds of prompts. My experience is that the prompts need a lot of iteration to truly one-shot something that is halfway usable. If it’s under-spec’d it’ll just return after 15-20 minutes with something that’s not even half baked. If I give it an extremely detailed spec it’ll start dropping requirements and then finish around the 60-70 minute mark, but I needed 20 minutes to write the prompt and I need to hunt for the things it didn’t bother to do.
I’ve had some success iterating on the one-shot prompt until it’s less work to productionize the newest artifact than to start over, and iterating like this does have some learning benefits. I’m not sure if it’s any faster than just focusing on the problem directly though.
The dropping requirements problem is real. What's helped us is breaking the spec into numbered acceptance criteria (ACs) and having the verification run per-criterion. If AC-3 fails you know exactly what got dropped.
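For anyone who wants to try the per-criterion idea, here is a minimal sketch of what it could look like. Everything in it is made up for illustration: the `Criterion` type, the `verify` and `dropped` helpers, and the three example criteria are hypothetical, and the string checks stand in for whatever real tests or linters you would run against the generated artifact.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """One numbered acceptance criterion (hypothetical structure)."""
    id: str
    description: str
    check: Callable[[str], bool]  # returns True when the artifact satisfies it


def verify(artifact: str, criteria: list[Criterion]) -> dict[str, bool]:
    """Run every criterion independently and record pass/fail per ID."""
    return {c.id: c.check(artifact) for c in criteria}


def dropped(results: dict[str, bool]) -> list[str]:
    """List the IDs of criteria that failed, in spec order."""
    return [cid for cid, ok in results.items() if not ok]


# Illustrative criteria only; real checks would be tests, not substring matches.
criteria = [
    Criterion("AC-1", "exposes a /health endpoint", lambda src: "/health" in src),
    Criterion("AC-2", "reads the port from the environment", lambda src: "PORT" in src),
    Criterion("AC-3", "logs each request", lambda src: "log" in src.lower()),
]

# Stand-in for what the agent produced.
artifact = 'app.get("/health", handler); listen(env["PORT"])'

results = verify(artifact, criteria)
for cid in dropped(results):
    print(f"{cid} was dropped")
```

The point of keeping one check per ID is that a failure maps straight back to a line in the spec, so you can re-prompt with just the missing criterion instead of rereading the whole output.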
When I was on Google Docs, I watched the Google Forms team build a sophisticated ML model that attempted to detect when people were using it for nefarious purposes.
It underperformed banning the word "password" from a Google Form.
I wonder if this is just an example of Goodhart's law. How did they measure the performance of those models? I would imagine they tried measuring against known cases of Forms misuse, i.e. the forms that contained a 'password' field.
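To make that guess concrete, here is a toy sketch of the selection effect. It says nothing about how the Forms team actually evaluated anything; the sample texts, `keyword_flag`, and `recall` are all invented for illustration. The assumption is that the "known misuse" set was itself collected by looking for the word "password".

```python
from typing import Callable


def keyword_flag(form_text: str) -> bool:
    """Baseline rule: flag any form that mentions 'password'."""
    return "password" in form_text.lower()


def recall(predict: Callable[[str], bool], positives: list[str]) -> float:
    """Fraction of known-bad forms that the predictor flags."""
    hits = sum(1 for text in positives if predict(text))
    return hits / len(positives)


# Hypothetical abusive forms; only half of them mention a password.
all_abusive = [
    "Enter your email and password to verify your account",
    "Confirm your bank PIN and card number",
    "Re-enter your Password to keep your mailbox",
    "Send your one-time login code here",
]

# Assumed labeling process: misuse cases were found by searching for the keyword.
labeled_by_keyword = [text for text in all_abusive if keyword_flag(text)]

recall_on_labels = recall(keyword_flag, labeled_by_keyword)
recall_on_everything = recall(keyword_flag, all_abusive)

print(f"recall on keyword-selected labels: {recall_on_labels:.2f}")
print(f"recall on all abusive forms:       {recall_on_everything:.2f}")
```

If the labels were built that way, the keyword rule scores perfectly on them by construction, and any model that catches the other kinds of abuse gets no credit for it.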
I followed his blog back when he started this descent, and I have a theory that it was hill climbing.
He used to blog about pretty innocent stuff: his wife making fun of him for wearing pajama pants in public, behind the scenes on drawing comics, funny business interactions he'd had. But then he started getting taken out of context by various online-only publications, and he'd get a burst of traffic and a bunch of hate mail and then it'd go away. And then he'd get quoted out of context again. I'm not sure if it bothered him, but he started adding preambles to his posts, like "hey suchandsuch publication, if you want to take this post out of context, jump to this part right here and skip the rest."
I stopped reading around this point. But later when he came out with his "Trump is a persuasion god, just like me, and he is playing 4d chess and will be elected president" schtick, it seemed like the natural conclusion of hill climbing on controversy. He couldn't be held accountable for the prediction. After all, he's just a comedian with a background in finance, not a politics guy. But it was a hot take on a hot topic that was trying to press buttons.
I'm sure he figured out before most people that being a newspaper cartoonist was a downward-trending gig, and that he'd never fully transition to online. But I'm sad that this was how he decided to make the jump to his next act.
Ahh, so that's what I've internally called "The Sharpiro Effect" really is. Though it's still a bigger shame that a philosophy professor would need to resort to this compared to a newspaper cartoonist.
I should have clarified, for people who had the good fortune not to be exposed to these posts, that this was usually his lead-in to his ultra-toxic writing, i.e. it was an engaging hook that led to more engaging trolling.
"Chuck Norris facts" was a text-only meme format from the mid '00s. Stuff like "Chuck Norris is the only man to ever defeat a brick wall in a game of tennis" or "When Chuck Norris does push-ups, he doesn't push himself up, he pushes the Earth down." The Jeff Dean Facts use the same format. It doesn't have anything to do with Chuck Norris himself.
I vaguely remember another instance of this around a guy in the army - I forgot if it was at boot camp or what the rank was, but it was something along the lines of “things I’m no longer allowed to do” and just had a bunch of silly military joke/prank type things… man I wonder if I could dig that up again, I think it might have been late 90s internet.
That would be Skippy's List[0], which as far as I know is the seminal work in the genre (at least on the internet). I originally learned about it through a (rather less compact) version about someone's D&D crimes[1], which was closer to my cultural wheelhouse, but the original holds up even if you have to google some phrases.
With the current crop of LLMs/agents, I find that refactors still have to be done at a granular level. "I want to make X change. Give me the plan and do not implement it yet. Do the first thing. Do the second thing. Now update the first call site to use the new pattern. You did it wrong and I fixed it in an editor; update the second call site to match the final implementation in $file. Now do the next one. Do the next one. Continue. Continue.", etc.