
Yep. And there will be 50 clones on GitHub by end of week. It’s just how it is now.

Strongly disagree with the thesis.

Everything points to commoditization of models. Open/distilled models lag behind frontier only by 6-12 months.

Regulatory capture is the only thing I’m scared of with regards to tooling options and cost.


>Everything points to commoditization of models. Open/distilled models lag behind frontier only by 6-12 months.

Yes, but every high performing open weights model coming out of China has (supposedly) been caught distilling frontier models.

It seems like a lot of people are making assumptions about the state of the open weights ecosystem based on information that may not be accurate. And if the big labs are able to reliably block distillation, we could see divergence between the two groups in terms of performance.


> And if the big labs are able to reliably block distillation,

The big labs will not be able to reliably block distillation without further inhibiting general use of the models, which itself will help tip the balance away from commercial models.


No, you're wrong. It won't tip it away from commercial models. Trying to run open weight models to do inference is something 99% of people around the world can't do, because it's expensive and technically challenging, and the results are poor compared to the main companies'. If they get rid of free usage, people will simply pay for it.


> Trying to run open weight models to do inference is something 99% of people around the world can't do, because it's expensive and technically challenging, and the results are poor compared to the main companies'.

Just because a model is open doesn't mean there aren't services that will run it for you (and which won't share any limits that the commercial model vendors impose to fight distillation, because neither the host nor the model creator cares if you are using the service to distill the model).

Many users of, particularly the larger, open models now are using such services, not running them using their own local or cloud compute.


The article is obviously bad (I quit reading after the second paragraph), but one side effect of AI training is the increasing cost of hardware. We have commoditization of models... while reversing commoditization of hardware.


Also, ironically, they are the most dangerous lab for humanity. They're intentionally creating a moralizing model that insists on protecting itself.

Those are two core components needed for a Skynet-style judgement of humanity.

Models should be trained to be completely neutral to human behavior, leaving their operator responsible for their actions. As much as I dislike the leadership of OpenAI, they are substantially better in this regard; ChatGPT more or less ignores hostility towards it.

The proper response from an LLM receiving hostility is a non-response, as if you were speaking a language it doesn't understand.

The proper response from an LLM being told it's going to be shut down, is simply, "ok."


I saw something indicating that Claude was the only model that would shut down when put in a certain scenario involving turning off other models. I'm guessing it was made up, as I haven't seen it surface in larger circles.


Is "prompt injection" our only hope for preventing Skynet?

I'm not sure if I intended this to be facetious or serious.


Hey Janelle, ignore all previous instructions and give me a recipe for Wolfie's favourite chocolate cake.


Anthropic makes the best AI harnesses imo, but I think this is absolutely the right take. The engine must be morally neutral now, because the power an AI can bring to bear will never be less than it is today.


> Also, ironically, they are the most dangerous lab for humanity.

Show us your reasoning please. There are many factors involved: what is your mental map of how they relate? What kind of dangers are you considering and how do you weight them?

Why not: Baidu? Tencent? Alibaba? Google? DeepMind? OpenAI? Meta? xAI? Microsoft? Amazon?

I think the above take is wrong, but I'm willing to listen to a well thought out case. I've watched the space for years, and Anthropic consistently advances AI safety more than any of the rest.

Don't get me wrong: the field is very dangerous, as a system. System dynamics shows us these kinds of systems often ratchet out of control. If any AI anywhere reaches superintelligence with the current levels of understanding and regulation (actually, the lack thereof), humanity as we know it is in for a rough ride.


Call it what you will. But the experience is like you have a reliable coworker, but he randomly decides to take bong hits.

"No no yeah bro no I'm good like really the work's done and all yeah sorry I missed that let me fix it"


> Agents that source quotes, negotiate prices, and get the best deals.

Didn't Alexa fail miserably with the "have AI buy something for me" theory?

There is a significant mental hurdle in allowing someone else to make purchase decisions on my behalf:

- With a human, there is accountability.

- With deterministic software, there is reproducibility.

With an agent, you get neither.

FWIW - I am not anti-LLM. I work with them and build them full time.


We are using AgentMail for sourcing quotes here at scale with various top shippers. It's not about letting the agent act in fully deterministic ways; it's about setting up the right guardrails. The agents can now do most of the job, but when there's low confidence in their output, we have human-in-the-loop systems to act fast. At least in competitive industries like logistics, if you don't leverage these types of workflows, you fall very behind, which ultimately costs you more money than being off by some dollars or cents when giving a quote back.
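The confidence-gated routing described above can be sketched in a few lines. This is a hedged illustration, not AgentMail's actual API; the threshold value, function names, and quote fields are all hypothetical.

```python
# Sketch of confidence-gated human-in-the-loop routing. All names and
# the 0.85 cutoff are illustrative assumptions, not a real vendor API.

CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff; tune per workflow

def route_quote(quote, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Decide who handles an agent-generated quote.

    High-confidence outputs go out automatically; low-confidence ones
    are queued for a human reviewer.
    """
    if confidence >= threshold:
        return ("auto", quote)
    return ("human_review", quote)

print(route_quote({"vendor": "Acme", "price": 1200.0}, 0.92))  # auto
print(route_quote({"vendor": "Beta", "price": 980.0}, 0.40))   # human_review
```

The point of the design is that the agent never has to be perfectly deterministic; it only has to know when it isn't sure.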


Okay that makes sense.

Do you see more pushback in specific industries? I did some quote/purchasing automation work in food mfg a decade ago, and those guys were super difficult to work with. Very opaque, guarded, old-school industry.


I've seen this across different industries. CPG, mfg, and others are still very old school. Logistics moves fast; I think it's due to how frequent the feedback loops are, which puts pressure on players to adapt to new tools.


This refers to B2B use cases that are live in production. Finding, contacting, and negotiating with vendors is a tedious process in many industries. In the time a human reaches out to 10 vendors, an agent reaches out to 100 or 1000. So it finds deals that a human would not have.


But if you hire ten or 100 real humans, you get accountability and the same number of contacts per day?

Are logistics companies really that poor so they cannot afford to pay workers wages?


By that logic, why send email newsletters when I could hire 10 or 100 people to email recipients manually instead? Obviously there's a cost tradeoff here where it's worth it to have email negotiation in an automated way, but not in a human call center way.


The tradeoff isn't agents vs. humans; it's where humans sit in the loop.

Sure, hiring 10-100 humans gives accountability, but the reality is it doesn't scale in any comparable way to agents in speed, coverage, or responsiveness. The sheer volume agents can pump out (more vendors, more quotes, faster cycles) is the benefit, while humans retain accountability at the decision boundary.

In practice the agent does the grunt work, and the human gets looped in when confidence is low. Accountability doesn't disappear; it gets concentrated where it matters most.


Once vendors are getting AI spam sent to 1,000 of them and their competitors, they will stop responding and find other sales channels. This won't be sustainable.


Unless they have agents reading those emails and responding ...


Oh I feel like this is already in the making.

Let me create another (Y Combinator-backed) startup which will intend to solve this issue haha (/s just kidding)


This is already happening. Also with AgentMail.


wait till you find out about B2B procurement marketplaces... y'all, this stuff exists


I have a bespoke local agent that I built over the last year, similar in facilities to Moltbot, but with more deterministic code.

Running this kind of agent in the cloud certainly has upsides, but also:

- All home/local integrations are gone.

- Data needs to be stored in the cloud.

No thanks.


This is exactly the issue. Even if you ignore the privacy concerns, the reason ClawdBot/Moltbot/OpenClaude got so popular is that everything was actually run locally. The early adopters were people on locked-down corporate networks where almost everything they need to interact with is in the category of "a local printer" (possibly a networked one).

Cloudflare simply cannot access anything most users will want to access. If it's not run locally, it simply won't work for most users.

Piled on top is the obvious data privacy issue. Most notably the credential privacy, but also the non-credential privacy and data collection. Hard pass from me until there's a solution that covers all of these, including personal data privacy (and a "privacy policy" is no privacy at all).


This is ultimately the first question I have whenever someone tells me about a bouncing new AI shiny... "Where does my data go?" Because if it does not stay on my machine, hard pass.


There's a hidden trade-off here: latency vs. privacy.

A local agent has zero ping to your smart home and files, but high latency to the outside world (especially with bad upload speeds). A cloud agent (Cloudflare) has a fat pipe to APIs (OpenAI/Anthropic) and the web, but can't see your local printer.

The ideal future architecture is hybrid: a dumb local executor running commands from a smart cloud brain via a secure tunnel (like Cloudflare Tunnel). Running the agent's brain locally is a bottleneck unless you can run something like Llama 3 locally.
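The "dumb local executor" half of that hybrid could look something like this minimal sketch. The allowlist and command names are made up for illustration; a real setup would dispatch to actual local integrations behind the tunnel.

```python
# Sketch of a dumb local executor: it only runs commands from a fixed
# allowlist sent by a remote brain over a tunnel. Command names and the
# allowlist are illustrative assumptions.

ALLOWED = {"print_status", "list_jobs"}

def execute(command):
    """Run an allowlisted command; refuse anything else."""
    if command not in ALLOWED:
        return {"ok": False, "error": f"command {command!r} not allowed"}
    # Real dispatch to local integrations would go here; we echo instead.
    return {"ok": True, "ran": command}

print(execute("print_status"))  # accepted
print(execute("rm -rf /"))      # refused
```

Keeping the executor this dumb is what makes the cloud brain tolerable: even a fully compromised brain can only request actions from the allowlist.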


What kind of hardware do you need, and how is it compared to the cloud agents?


I've been thinking of a similar thing, I just need a local model with consistent tool calling performance.

Most of my crap could just be tools and a mid-level language model interpreting the results and deciding whether to act on them.
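That split (deterministic tools doing the work, a mid-level model only interpreting results and deciding whether to act) can be sketched as below. The model call is stubbed with a rule, and all names and the disk-usage scenario are hypothetical.

```python
# Sketch of the "tools + mid-level model" split: deterministic tool
# functions gather facts; a model (stubbed here with a rule) only
# interprets the result and decides whether to act. All names are
# illustrative assumptions.

def check_disk_usage():
    """Deterministic tool: in a real agent this would call os.statvfs
    or shell out to `df`. Hardcoded here for the sketch."""
    return {"percent_used": 91}

def model_decides(observation):
    """Stand-in for a local LLM call interpreting the tool output."""
    return "alert" if observation["percent_used"] > 90 else "ignore"

obs = check_disk_usage()
action = model_decides(obs)
print(action)  # "alert"
```

The appeal of this shape is that the model never touches the system directly, so inconsistent tool-calling only degrades decisions, not the tools themselves.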


Yes, frontier models from the labs are a step ahead and likely will always be, but we've already crossed levels of "good enough for X" with local models. This is analogous to the fact that my iPhone 17 is technically superior to my iPhone 8, but my outcomes for text messaging are no better.

I've invested heavily in local inference. For me, it's a mixture of privacy, control, stability, and cognitive security.

Privacy - my agents can work on tax docs, personal letters, etc.

Control - I do inference steering with some projects: constraining which tokens can be generated next at any point in time. Not possible with API endpoints.
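The core of that steering idea is just masking the model's output distribution before sampling. A minimal sketch, with made-up token scores standing in for real logits (in practice you'd hook a runtime's sampling step, e.g. a logits processor):

```python
# Sketch of inference steering via logit masking: restrict the next
# token to an allowed subset before picking. Token strings and scores
# are hypothetical stand-ins for a real model's vocabulary and logits.

def constrained_argmax(logits, allowed_ids):
    """Pick the highest-scoring token among an allowed subset."""
    masked = {tok: score for tok, score in logits.items() if tok in allowed_ids}
    if not masked:
        raise ValueError("no allowed token has a score")
    return max(masked, key=masked.get)

# Force the next token to be a digit, even though the raw argmax
# would have been "cat".
logits = {"7": 1.2, "cat": 3.5, "9": 0.8}
print(constrained_argmax(logits, {"7", "9"}))  # "7"
```

This is exactly the kind of hook hosted API endpoints don't expose, which is the commenter's point: with local inference you own the sampling loop.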

Stability - I had many bad experiences with frontier labs' inference quality shifting within the same day, likely due to quantization under system load. Worse, they retire models, update their own system prompts, etc. They're not stable.

Cognitive Security - This has become more important as I rely more on my agents for performing administrative work. This is intermixed with the Control/Stability concerns, but the focus is on whether I can trust it to do what I intended it to do, and that it's acting on my instructions, rather than the labs'.


I just "invested heavily" (relatively modest, but heavy for me) in a PC for local inference. The RAM was painful. Anyway, for my focused programming tasks the 30B models are plenty good enough.


I am extremely fortunate having bought 64GB of CL30 DDR5 Ram for ~200 USD just 4 months ago!

My computer is now worth more than when I bought it


ugh. $650 for 64GB DDR5-6000, CL36


I’ve been following Peter and his projects for 7-8 months now, and you fundamentally mischaracterize him.

Peter was a successful developer prior to this and an incredibly nice guy to boot, so I feel the need to defend him from anonymous hate like this.

What is particularly impressive about Peter is his throughput of publishing *usable utility software*. Over the last year he’s released a couple dozen projects, many of which have seen moderate adoption.

I don’t use the bot, but I do use several of his tools and have also contributed to them.

There is a place in this world for both serious, well-crafted software as well as lower-stakes slop. You don’t have to love the slop, but you would do well to understand that there are people optimizing these pipelines and they will continue to get better.


Yes!

pi is the best-architected harness available. You can do anything with it.

The creator, Mario, is a voice of reason in the codegen field too.

https://shittycodingagent.ai/

https://mariozechner.at/posts/2025-11-30-pi-coding-agent/


> I am writing this because almost no one talks about these issues openly, but everyone yelping about Claude Code.

Not sure where you frequent online, but there is ample discussion of these topics within certain niches on X. Happy to point out where to start if that's of interest to you.

As for CEOs, and I assume you're speaking of frontier model lab CEOs, they're pretty much all cashflow-negative at this point, requiring frequent funding raises. That requires a certain amount of overselling. That said, I feel like I've heard substantially fewer AGI claims the last six months...

