
thepasch · 2026-06-13T12:13:56 1781352836

For the same reason this website describes GPT-OSS 120b as "the workhorse" and thinks Gemma 3 should be the vision model to use here: because it's vibeslopped garbage not a single human has ever laid a singular eye on, designed to capitalize off of current headlines and scam clueless marks into whatever nonsense it is they're peddling.

Thinking GPT-OSS still has any relevance at all in the open weights community is a hallmark tell for unvetted AI slop, because that's about where the OSS SOTA was at the time all the current frontier models had their knowledge cutoff.

thepasch · 2026-06-13T10:02:34 1781344954

MiniMax and Moonshot both literally just released the weights for their latest flagship models, a few weeks after DeepSeek did the same. One lab a pattern does not make.

thepasch · 2026-06-13T09:56:57 1781344617

> How is that going to help him? "Our models are so inferior they are not deemed a threat unlike anthropic's"?

Do you honestly think that this - logic and reason - is going to stop anyone from hyping whatever nonsense he comes up with to the moon and back anyway? Right after the SpaceX IPO of all things?

thepasch · 2026-06-13T09:22:15 1781342535

> They're not close to Opus or the latest GPT yet

Disagreed. GLM-5.1 is easily as good as Opus 4.5 for all the coding purposes I could throw at it, which is the model that kicked this entire hype cycle into overdrive in the first place.

thepasch · 2026-06-12T03:31:05 1781235065

Am I crazy to be extremely suspicious about the fact that this heavily security-focused task suite didn't trigger a single one of the infamously, hilariously overparanoid guardrails? This, along with the fact that the model "cheated" by scouring the git history for an upstream fix and implementing byte-perfect replications of existing fixes without prior exploration, makes me wonder whether both the model itself and the security classifiers are tuned to act very differently when they detect that the model is being benchmarked. I can think of few to no other plausible explanations for this sort of behavior.

May be a bit tin-foil, but...

thepasch · 2026-06-10T21:34:22 1781127262

As per usual in situations like these, one must look at the actions in order to assess whether there's any worth in the words. And the actions of Anthropic have, by and large, been steering hard towards establishing a walled garden, empowering corporations over consumers, pushing for regulatory capture under the guise of national security, and consolidating as much power as possible within Anthropic and no one else.

He is certainly skilled at writing philosophical essays that sound like they make cogent and thoughtful points (and sometimes genuinely do make cogent and thoughtful points), but his company's actions disregard his rhetoric at their best and actively contradict it at their worst. For instance: there was zero pressure on Anthropic to release this model to anyone - they were ostensibly in the lead, which is the exact scenario in which they said they'd hold back model releases, back when they axed their safety policy the instant it came under the slightest amount of economic pressure:

> And it promises to “delay” Anthropic’s AI development if leaders both consider Anthropic to be leader of the AI race and think the risks of catastrophe to be significant. https://time.com/7380854/exclusive-anthropic-drops-flagship-...

Yet this essay proposes this extreme auditing and regulatory administration pipeline that new models are supposed to go through before they release, right after they, themselves, under no pressure, ran a months-long marketing campaign under apocalyptic rhetoric, which they continue to harp on to the point of nerfing/auto-downgrading their model into uselessness for many legitimate tasks that older models had absolutely no issue supporting, while the supposedly extremely dangerous version... can be freely used with no guardrails by their corporate partners.

The hypocrisy here is neither difficult to see nor is it particularly sophisticated, which makes it all the more infuriating.

thepasch · 2026-06-10T21:22:23 1781126543

Working to keep a roof over the head of yourself and those you love is a necessity. It can become an identity if you enjoy what you do, sure, but that is not a given for, I'd say, a big majority of the workforce, globally.

slyzmud · 2026-06-10T23:18:48 1781133528

I agree with you, 99% of the people work just to pay bills, but that doesn't make the other part false.

I'm a software engineer and love thinking about problems methodically. Every time I hear someone saying that programmers are no longer required (even if I don't agree with that), it feels really bad; it's equivalent to saying that what I do best in life has no value anymore.

To put it in other words: I really like philosophy, but what value does it provide in the modern world? Who pays for the work of a philosopher? I think people will start thinking of programmers like that eventually.

poslathian · 2026-06-11T01:34:38 1781141678

I’m lucky to have more than my share of really exceptional programmers to hang out with and they all say the same thing: “I haven’t been writing code for months and don’t expect to again”

This is a way different sentiment than “programmers aren’t needed anymore” - I’m just seeing ambition, motivation, and fun go up in lockstep.

I first heard this in November and slowly one by one it’s everyone whose opinion I respect.

FWIW the other popular topic is how abysmally stupid and limited these amazing tools continue to be, despite also being magic.

Oh and that none of us have gotten token maxxing to succeed, despite lots of trying.

besterman23 · 2026-06-10T23:27:39 1781134059

You’re arguing with a subset of people who have made work their entire life and have retroactively justified their sacrifices with thoughts such as “high compensation means what I do is socially valuable.” However, at the same time they work at Meta or something, making internal tools to make product developers 5% more efficient at tweaking the addiction algorithm to gain 0.2% more screen-time per user.

thepasch · 2026-06-10T07:47:00 1781077620

> Yeah I think there are ways to know, ways involving less dependence on a LLM.

This kills the entire value prop of using LLMs as research accelerators, though.

thepasch · 2026-06-09T19:28:54 1781033334

What it feels like to work with Fable:

> Switched to Opus 4.8: Fable 5 has safety measures that flag messages on most cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Send feedback or learn more.

matheusmoreira · 2026-06-09T21:41:18 1781041278

Same experience here. The parts of my project that actually could have benefited from Fable's code review got this instead.

thepasch · 2026-06-09T19:03:27 1781031807

Yeesh. Anthropic's paranoia about China is starting to get pathological.