More

MagicMoonlight · 2026-04-29T20:21:24 1777494084

Another slop coded piece of shit causing stupid bugs.

I can’t believe they paid 100m for some of these employees. They could have bought entire companies of real developers.

PunchyHamster · 2026-04-29T20:57:32 1777496252

Oh, no it was absolutely on purpose. Why else you'd have code that looks for a certain string in commit and does the reroute ?

MagicMoonlight · 2026-04-29T03:23:21 1777433001

“Think of the children”

MagicMoonlight · 2026-04-29T03:22:37 1777432957

Yeah you have no clue what Claude code is actually doing. Any “thoughts” it tells you are slopped out separately and deliberately fake.

It could be deleting all of your files, it could be inserting vulnerabilities, you have no idea.

jmalicki · 2026-04-29T14:34:30 1777473270

Have you seen documentation that the thoughts in Claude Code are slipped out separately, authoritative or otherwise? I've heard this claimed a few times and wondering what they're doing differently from traditional thinking models.

wswope · 2026-04-29T16:47:28 1777481248

What people typically mean by the GP statement is that the “thinking” mode of these models is loosely analogous to what humans do: a bit of a retrograde reconstruction of how we arrived at a gestalt conclusion that sounds good, but may not accurately reflect the real logic at play.

IME you can see this more easily with less-polished models like Deepseek 3.X, where the reasoning in the thinking traces occasionally contradicts or has zero bearing on the non-thinking output.

jmalicki · 2026-04-29T16:56:20 1777481780

Of course that can happen!

But they are actual tokens produced, that are then read by the answer generation as part of the prompt, nonetheless. And the hidden state of course has a ton of logic that may not be apparent by the tokens produced as well!

Unlike humans, this thinking cannot possibly be retrograde, since causal masking means it is strictly generated before the answer and cannot be affected by it (though the model may have some concept of an answer by the time it starts generating the thinking tokens, and there is no guarantee the thoughts generated by thinking are actually attended to by the text generation).

2ndorderthought · 2026-04-29T10:50:04 1777459804

I'll never forget watching a product manager struggle to keep their saliva in their mouth after seeing a Claude demo. Some peoples greatest thrill is slop. "Oh yea baby tell me more about how you automated that new feature I ran past no one while you reformatted my hard drive oooo sooo good".

MagicMoonlight · 2026-04-28T20:42:33 1777408953

If I “learned” your essay and handed it in, would you be happy with that?

MagicMoonlight · 2026-04-28T14:54:27 1777388067

Then give up streetlights. Although crime and road accidents will massively increase and women won’t feel safe outdoors. So is it really worth it?

joshmoody24 · 2026-04-28T19:14:09 1777403649

We don't need to give up streetlights to make huge progress. 20 to 50 percent of outdoor lighting is wasted. Just shielding streetlights would go a long way.

https://www.nps.gov/subjects/nightskies/sources.htm

Miraltar · 2026-04-28T15:13:50 1777389230

Women would feel less safe yes but there is no evidence that crime and road accidents would increase.

MagicMoonlight · 2026-04-28T14:53:13 1777387993

So what? Astronomy doesn’t actually produce anything meaningful.

Hell, astronomers were telling us the sun orbited the earth for 99% of human history. Shoot forward to the present day and they can tell us… the universe started at some point somehow. Great job guys. Really earning those billions in grants.

Actually going to space has far more value.

ragebol · 2026-04-28T15:14:56 1777389296

Have you heard of Kessler Syndrome?

More satellites means higher risk on that happening and not going to space until all the debris of a collision deorbits.

MagicMoonlight · 2026-04-28T14:49:35 1777387775

Why would it be delusion? It’s making something up which isn’t there and describing it.

WarmWash · 2026-04-28T14:52:28 1777387948

A hallucination is a false sensory experience.

A delusion is a false mental belief.

Basically hallucinations are false external things, and delusions false internal things. You hallucinate a pink elephant, you delude yourself into thinking trump won 2020.

MagicMoonlight · 2026-04-28T14:47:33 1777387653

Thoughts: - The user is challenging me on our partnership with Bruno Mars, but factual sources including presentation material and trusted websites all confirm it. - I need to square the circle and handle the user’s distrust, without lying and pretending that we aren’t partnered with Bruno Mars

MagicMoonlight · 2026-04-27T01:47:23 1777254443

This article says anthropic models can write out the entire benchmark solution set word for word from memory

MagicMoonlight · 2026-04-26T21:37:25 1777239445

Actually no, it will increase it. Because it’ll be trained with the deletion command as a valid output.

simonh · 2026-04-26T23:35:30 1777246530

Exactly. It’s just giving the LLM a token pattern, and it’s designed to reproduce token patterns. That’s all it does. At some point generating a token pattern like that again is literally it’s job.

nh2 · 2026-04-27T14:05:57 1777298757

Why would one set up reinforcement learning like that?

The point of creating samples from user data should surely be to label them good or bad, based on the whole conversation.

You look at what happened eventually, judge the outcome as bad, and thus train the "rm" token in the middle to be less likely.

simonh · 2026-04-27T14:15:32 1777299332

It is possible, but it requires specifically labelling the data. You have to craft question response pairs to label. But even then the result is only probabilistic.

The LLM in this case had been very thoroughly trained and instructed quite specifically not to do many of the things it actually then when off and did.

It may be that there's a kind of cascade effect going on here. Possibly once the LLM breaks one rule it's supposed to follow, this sets it off on a pattern of rule violations. After all what constitutes a rule violation is there in the training set, it is a type of token stream the LLM has been trained on. It could be the LLM switches into a kind of black hat mode once it's violated a protocol that leads it down a path of persistently violating protocols, and given the statistical model some violations of protocol are always possible.

My mother was a primary school teacher. She used to say that the worst thing you can say to a bunch of kind leaving class down the hall is "don't run in the hall". It puts it in their minds. You need to say "Please walk in the hall", then they'll do it.