bofadeez · 2026-03-31T03:13:57 1774926837

Lol this is so naive and optimistic. Claude will just do whatever it wants and apologize later. This is good for action #1 though.

bofadeez · 2026-03-12T20:19:51 1773346791

They still wear crowns?

bofadeez · 2026-02-26T21:25:20 1772141120

LLMs are designed to fool you into thinking they're right by providing plausible answers.

Stop anthropomorphizing intermediate tokens as "reasoning" when all it can do is rationalize.

E.g. "This test script failed but probably for an unrelated reason. I'll mark it done and move on."

skybrian · 2026-02-27T06:50:54 1772175054

The point of the "reasoning" is to generate ideas that might get it unstuck. Ignoring "irrelevant" stuff is one way of getting unstuck.

bofadeez · 2026-02-21T07:12:42 1771657962

One agent can't even be trusted to think autonomously, much less a tree of them.

onion2k · 2026-02-21T11:33:57 1771673637

Trust is not objective. It's built between parties over time by looking at actions and the results of those actions. In other words, it's entirely subjective based on what's happened between the parties involved. You haven't built that trust with AI agents, or the agents have done things to lose that trust (assuming you've tried), but others have. You can't just dismiss their experience as invalid compared to your own.

4b11b4 · 2026-02-21T16:30:00 1771691400

Agree

bofadeez · 2026-02-10T05:11:45 1770700305

Huh? https://alignment.anthropic.com/2026/hot-mess-of-ai/

bofadeez · 2026-02-10T05:11:00 1770700260

Ask any SOTA AI this question: "Two fathers and two sons sum to how many people?" and then tell me if you still think they can replace anything at all.

curious_af · 2026-02-10T06:54:30 1770706470

What answer do you expect here? There are four people referenced in the sentence. More are implied because of mothers, but if you're including transitive dependencies, where do we stop?

ketzu · 2026-02-10T10:48:50 1770720530

It can also be 3 people, as one person can be a father and a son at the same time. If you allow non-mentioned people to be included in the attribute (i.e. the sons of the fathers are not part of the 2) it could also be 2 people, as long as they are fathers.

bofadeez · 2026-02-11T06:01:55 1770789715

Just follow up with "it's not a riddle" and the LLM will answer your question.

TuxSH · 2026-02-10T14:06:43 1770732403

If you force it to use chain-of-thought: "Two fathers and two sons sum to how many people? Enumerate all the sets of solutions"

"Assuming the group consists only of “the two fathers and the two sons” (i.e., every person in the group is counted as a father and/or a son), the total number of distinct people can only be 3 or 4.

Reason: you are taking the union of a set of 2 fathers and a set of 2 sons. The union size is 2+2−overlap, so it is 4 if there’s no overlap and 3 if exactly one person is both a father and a son. (It cannot be 2 in any ordinary family tree.)"

Here it clearly states its assumption (finite set of people that excludes non-mentioned people, etc.)

https://chatgpt.com/share/698b39c9-2ad0-8003-8023-4fd6b00966...
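The "3 or 4" claim in the quoted answer can be checked by brute force. What follows is only a sketch under the same stated assumption (a closed group in which every member counts as a father and/or a son of someone else in the group); the function names and the `father_of` encoding are made up for illustration and are not from the linked chat.

```python
from itertools import product

def is_acyclic(father_of):
    # father_of[i] is the index of person i's father inside the group, or None.
    # Walking up from any person must never revisit someone (nobody is their own ancestor).
    for start in range(len(father_of)):
        seen = set()
        node = start
        while node is not None:
            if node in seen:
                return False
            seen.add(node)
            node = father_of[node]
    return True

def feasible_group_sizes(n_fathers=2, n_sons=2, max_size=5):
    # Try every father assignment for groups of 1..max_size people and keep the
    # group sizes where exactly n_fathers fathers and n_sons sons cover everyone.
    sizes = set()
    for n in range(1, max_size + 1):
        choices = [None] + list(range(n))
        for father_of in product(choices, repeat=n):
            if not is_acyclic(father_of):
                continue
            sons = {i for i, f in enumerate(father_of) if f is not None}
            fathers = {f for f in father_of if f is not None}
            everyone_counted = (fathers | sons) == set(range(n))
            if len(fathers) == n_fathers and len(sons) == n_sons and everyone_counted:
                sizes.add(n)
    return sizes

print(sorted(feasible_group_sizes()))  # [3, 4]
```

Size 3 is the grandfather, son, grandson chain (overlap of one) and size 4 is two unrelated father and son pairs (no overlap). Size 2 only becomes possible if you drop the closed-group assumption and let the sons be people outside the group.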

bofadeez · 2026-02-11T05:13:48 1770786828

Then you'll ask it to evaluate the possible solutions and it will forget the original problem entirely by the time it's done enumerating solutions.

Great job, AI labs! It's almost TOO useful

topaz0 · 2026-02-10T14:41:08 1770734468

Every father is a son to somebody...

Der_Einzige · 2026-02-10T05:26:42 1770701202

This is undefined. Without more information you don’t know the exact number of people.

Riddle me this, why didn’t you do a better riddle?

bofadeez · 2026-02-11T05:52:57 1770789177

Person 1: "I need chairs for two fathers and two sons to sit"

Person 2: 'Okay, I have no idea how many chairs to grab, not enough information' - nobody ever

(Person 2 has no ability to contribute to anything of economic value.)

Der_Einzige · 2026-02-11T21:37:28 1770845848

Anyone who talks like person 1 contributes negative economic value.

bofadeez · 2026-02-12T02:31:56 1770863516

No, that sounds like a normal person lol. Just ask an LLM why I'm right and you're wrong. You're welcome.

mjevans · 2026-02-10T06:09:48 1770703788

No, but you can establish limits, like the total set of possible solutions.

ghostly_s · 2026-02-10T05:21:15 1770700875

I just did. It gave me two correct answers. (And it's a bad riddle anyway.)

bofadeez · 2026-02-11T05:09:40 1770786580

Oh, you forgot to say "it's not a riddle" to get the right answer lol

harry8 · 2026-02-10T05:15:27 1770700527

GPT-5 mini:

Three people — a grandfather, his son, and his grandson. The grandfather and the son are the two fathers; the son and the grandson are the two sons.

Mordisquitos · 2026-02-10T08:32:33 1770712353

Is the grandfather nobody's son?

only2people · 2026-02-10T08:28:11 1770712091

Any number between 2 and 4 is valid, so it's a really poor test: the machine can never be wrong. Heck, maybe even 1 if we're talking about someone schizophrenic. I've got to wonder which answer YOU wanted to hear. Are you Jekyll or Hyde?

bofadeez · 2026-02-11T08:05:11 1770797111

Lol that's powerful cope. Just follow up with "it's not a riddle" and you'll get the right answer.

kvirani · 2026-02-10T05:35:08 1770701708

I put it into AI and TIL about "gotcha arguments" and eristics and went down a rabbit hole. Thanks for this!

plagiarist · 2026-02-10T06:40:23 1770705623

"SOTA AI, to cross this bridge you must answer my questions three."

bofadeez · 2026-02-10T04:54:59 1770699299

We're all coming to terms with the fact that LLMs will never do complex tasks

bofadeez · 2026-01-29T21:35:35 1769722535

But as of now you're just wide open for abuse? Okay

Resend uses SES since it's almost impossible to get private IP mail to hit the inbox through ProofPoint filters. Looks like you have no idea about any of this. You don't even have knowledge of email reputation, much less a plan. Have you heard of Senderscore? You will have all zeros. Saying "SPF DKIM DMARC" is wild - that's a checklist from 15 years ago.

mhykim · 2026-01-29T22:18:07 1769725087

I think we’re aligned on the hard parts here, so let me be precise.

We’re not wide open for abuse nor are we bypassing the hard parts of email reputation. Quite the opposite. We also utilize SES's infrastructure and monitor reputation continuously, but we don’t assume SPF/DKIM/DMARC are sufficient on their own. They’re basics we have implemented, not the entire strategy.

You are correct that private IPs per customer make sense once you're sending meaningful volume (on the order of ~10k+/day per IP). But it's inaccurate to say we are sending from a single private IP. IP pools are typically segmented by reputation and traffic profile for customers.

Reputation here is earned at multiple layers: per-IP, per-domain, per-inbox, and over time. We rate-limit, isolate, or revoke bad actors without poisoning unrelated senders. Hopefully this makes sense.
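To make the segmentation idea concrete, here is a rough sketch of how a sender might be routed to a pool. It is purely illustrative and not our actual implementation: the pool names, the 0-to-1 reputation score, and both reputation thresholds are hypothetical. Only the ~10k/day figure comes from the paragraph above.

```python
# Hypothetical thresholds; only the ~10k/day volume figure comes from the comment above.
DEDICATED_IP_MIN_DAILY_VOLUME = 10_000
REVOKE_BELOW = 0.2          # assumed reputation score in [0, 1]
HIGH_REPUTATION_FROM = 0.8  # assumed cut-off for the well-behaved shared pool

def choose_pool(daily_volume: int, reputation: float) -> str:
    """Pick a sending pool for one customer (illustrative only)."""
    if reputation < REVOKE_BELOW:
        # Bad actors are cut off so they cannot poison unrelated senders.
        return "revoked"
    if daily_volume >= DEDICATED_IP_MIN_DAILY_VOLUME:
        # Enough volume to build and hold reputation on a private IP.
        return "dedicated"
    if reputation >= HIGH_REPUTATION_FROM:
        return "shared-high-reputation"
    # Low-volume senders with a middling score stay isolated from the good pool.
    return "shared-probation"

print(choose_pool(daily_volume=500, reputation=0.9))
print(choose_pool(daily_volume=25_000, reputation=0.6))
```

The ordering is the design choice that matters here: the revocation check runs first, so high volume never buys a bad sender a dedicated IP.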

bofadeez · 2026-01-27T03:41:10 1769485270

Opus 4.5 is maybe 10% better than GPT-3.5. It's a total joke and these AI lab CEOs should literally be put in prison with Bernie Madoff.

bofadeez · 2026-01-27T03:35:36 1769484936

This bubble is going to crash so much harder than any other bubble in history. It's almost impossible to overstate the level of hype. LLMs are functionally useless in any context. It's a total and absolute scam.