bofadeez's comments

Lol this is so naive and optimistic. Claude will just do whatever it wants and apologize later. This is good for action #1 though.

They still wear crowns?


LLMs are designed to fool you into thinking they're right by providing plausible answers.

Stop anthropomorphizing intermediate tokens as "reasoning" when all it can do is rationalize.

E.g. "This test script failed but probably for an unrelated reason. I'll mark it done and move on."


The point of the "reasoning" is to generate ideas that might get it unstuck. Ignoring "irrelevant" stuff is one way of getting unstuck.


One agent can't even be trusted to think autonomously, much less a tree of them.


Trust is not objective. It's built between parties over time by looking at actions and the results of those actions. In other words, it's entirely subjective based on what's happened between the parties involved. You haven't built that trust with AI agents, or the agents have done things to lose that trust (assuming you've tried), but others have. You can't just dismiss their experience as invalid compared to your own.


Agree



Ask any SOTA AI this question: "Two fathers and two sons sum to how many people?" and then tell me if you still think they can replace anything at all.


What answer do you expect here? There are four people referenced in the sentence. More are implied because of mothers, but if you're including transitive dependencies, where do we stop?


It can also be 3 people, as one person can be a father and a son at the same time. If you allow non-mentioned people to be included in the attribute (i.e. the sons of the fathers are not part of the 2) it could also be 2 people, as long as they are fathers.


Just follow up with "it's not a riddle" and the LLM will answer your question.


If you force it to use chain-of-thought: "Two fathers and two sons sum to how many people? Enumerate all the sets of solutions"

"Assuming the group consists only of “the two fathers and the two sons” (i.e., every person in the group is counted as a father and/or a son), the total number of distinct people can only be 3 or 4.

Reason: you are taking the union of a set of 2 fathers and a set of 2 sons. The union size is 2+2−overlap, so it is 4 if there’s no overlap and 3 if exactly one person is both a father and a son. (It cannot be 2 in any ordinary family tree.)"

Here it clearly states its assumption (finite set of people that excludes non-mentioned people, etc.)

https://chatgpt.com/share/698b39c9-2ad0-8003-8023-4fd6b00966...
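For anyone who wants to check the quoted "3 or 4" claim, here is a rough brute-force sketch. It is not what the model ran; it only encodes the assumptions stated in the quote: the group is closed, everyone in it counts as a father and/or a son of someone else in the group, each person has at most one father, and nobody is their own ancestor. The names `is_acyclic` and `feasible_group_sizes` are made up for illustration.

```python
from itertools import product


def is_acyclic(father_of):
    """Return True if following father links never loops back on itself."""
    for start in range(len(father_of)):
        seen = set()
        cur = start
        while cur is not None:
            if cur in seen:
                return False
            seen.add(cur)
            cur = father_of[cur]
    return True


def feasible_group_sizes(max_size=5):
    """List group sizes n that admit exactly 2 fathers and 2 sons.

    father_of[i] is the index of person i's father inside the group,
    or None if that father is not in the group (an assumption of this sketch).
    """
    sizes = []
    for n in range(1, max_size + 1):
        choices = [None] + list(range(n))
        for father_of in product(choices, repeat=n):
            if not is_acyclic(father_of):
                continue
            fathers = {f for f in father_of if f is not None}
            sons = {i for i, f in enumerate(father_of) if f is not None}
            # Everyone in the group must be a father and/or a son.
            if len(fathers) == 2 and len(sons) == 2 and len(fathers | sons) == n:
                sizes.append(n)
                break
    return sizes


print(feasible_group_sizes())
```

Under these assumptions the search agrees with the union argument: the size is 2 + 2 minus the overlap, and an overlap of 2 would need two people to be each other's father. Drop the closed-group assumption and the answer set changes, which is what the replies arguing for 2 are relying on.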


Then you'll ask it to evaluate the possible solutions and it will forget the original problem entirely by the time it's done enumerating solutions.

Great job, AI labs! It's almost TOO useful


Every father is a son to somebody...


This is undefined. Without more information you don’t know the exact number of people.

Riddle me this, why didn’t you do a better riddle?


Person 1: "I need chairs for two fathers and two sons to sit"

Person 2: 'Okay, I have no idea how many chairs to grab, not enough information' - nobody ever

(Person 2 has no ability to contribute to anything of economic value.)


Anyone who talks like person 1 contributes negative economic value.


No, sounds like a normal person lol. Just ask an LLM why I'm right and you're wrong. You're welcome.


No, but you can establish limits, like the total set of possible solutions.


I just did. It gave me two correct answers. (And it's a bad riddle anyway.)


Oh you forgot to say "it's not a riddle" and then get the right answer lol


GPT-5 mini:

Three people — a grandfather, his son, and his grandson. The grandfather and the son are the two fathers; the son and the grandson are the two sons.


Is the grandfather nobody's son?


Any number between 2 and 4 is valid, so it's a really poor test; the machine can never be wrong. Heck, maybe even 1 if we're talking someone schizophrenic. I've got to wonder which answer YOU wanted to hear. Are you Jekyll or Hyde?


Lol that's powerful cope. Just follow up with "it's not a riddle" and you'll get the right answer.


I put it into AI and TIL about "gotcha arguments" and eristics and went down a rabbit hole. Thanks for this!


"SOTA AI, to cross this bridge you must answer my questions three."


We're all coming to terms with the fact that LLMs will never do complex tasks


But as of now you're just wide open for abuse? Okay

Resend uses SES since it's almost impossible to get private IP mail to hit the inbox through ProofPoint filters. Looks like you have no idea about any of this. You don't even have knowledge of email reputation, much less a plan. Have you heard of Senderscore? You will have all zeros. Saying "SPF DKIM DMARC" is wild - that's a checklist from 15 years ago.


I think we’re aligned on the hard parts here, so let me be precise.

We’re not wide open for abuse nor are we bypassing the hard parts of email reputation. Quite the opposite. We also utilize SES's infrastructure and monitor reputation continuously, but we don’t assume SPF/DKIM/DMARC are sufficient on their own. They’re basics we have implemented, not the entire strategy.

You are correct that private IPs per customer make sense once you’re sending meaningful volume (on the order of ~10k+/day per IP). But it's inaccurate to say we are sending from a single private IP. IP pools are typically segmented by reputation and traffic profile for customers.

Reputation here is earned at multiple layers: per-IP, per-domain, per-inbox, and over time. We rate-limit, isolate, or revoke bad actors without poisoning unrelated senders. Hopefully this makes sense.


Opus 4.5 is maybe 10% better than GPT 3.5. It's a total joke and these AI lab CEOs should literally be put in prison with Bernie Madoff.


This bubble is going to crash so much harder than any other bubble in history. It's almost impossible to overstate the level of hype. LLMs are functionally useless in any context. It's a total and absolute scam.

