Hacker News | LPisGood's comments

You don’t need to be sneaky. Just require all contributing PRs to say openclaw.

They’re also objectively not “unable”; they are “unwilling,” and hiding behind policies as if they were unalterable laws is silly.

How would you fire an agent? This impacts the company that makes the LLM, but not the agent itself.

Some Erdős problems are basically trivial using sophisticated techniques that were developed later.

I remember one of my professors, a coauthor of Erdős, boasting to us after a quiz about how proud he was to have assigned an Erdős problem that went unsolved for a while as just a quiz problem for his undergrads.


Worth mentioning, though, that people have already tried running all of them through LLMs at this point.

So this is proof of the models actually getting stronger (previous generations of LLMs were unable to solve this one).


Not definitively. LLMs are stochastic with respect to input, temperature and the exact prompt. It's possible that the model was already capable of it but never received the exact right conditions to produce this output.

Every model is able to solve each problem, given the right prompt. (Worst case, the prompt contains the solution.)

Interesting... Exhaustive brute force prompting might expose previously unknown capabilities in existing models. Seems like a whole can of worms.

Exhaustive brute-force prompting is completely infeasible. The number of potential prompts is impossibly large.

The "exhaustive brute forcing" approach does not even need an LLM in the loop. Just brute force the possible outputs instead. They will contain all the most beautiful novels you can imagine!

> So this is proof of the models actually getting stronger (previous generations of LLMs were unable to solve this one).

No, it's not.

While I don't dispute that new models may perform better at certain tasks, the fact that someone was able to use them to solve a novel problem is not proof of this.

LLM output is nondeterministic. Given the same prompt, the same LLM will generate different output, especially when it involves a large number of output tokens, as in this case. One of those attempts might produce a correct output, but this is not certain, and it is difficult if not impossible for a human who is not an expert in the domain to determine, as shown in this thread.
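The nondeterminism being described is easy to see in a toy sampler (a minimal sketch, not any particular vendor's implementation): at a nonzero temperature the same logits can yield different tokens across runs, while temperature zero is greedy and repeatable.

```python
import math
import random

def sample_next_token(logits, temperature, rng):
    """Sample a token index from raw logits at a given temperature.
    temperature <= 0 is treated as greedy decoding (argmax)."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling: walk the cumulative distribution.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [2.0, 1.5, 0.5]
# Greedy decoding is deterministic: always the top logit.
greedy = sample_next_token(logits, 0, random.Random(0))
# At temperature 1, different RNG states pick different tokens.
picks = {sample_next_token(logits, 1.0, random.Random(seed)) for seed in range(50)}
```

With this toy distribution the top token gets only about 55% of the probability mass, so over 50 independent runs more than one distinct token shows up, which is the whole point: a single failed attempt tells you little about capability.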


This is one of a number of such results achieved only in the last few months, with only the latest crop of models. They have undoubtedly gotten better in this domain. Saying anything else is just denial. You can run these same problems on GPT-4 or 5 all you want; you'll get nowhere. In fact people did, and you're hearing about it now because it's this crop of models that is getting meaningful results.

As others have pointed out, a key part of the prompt used here may have been "don't search the internet" as it would most likely have defaulted to starting off with existing approaches to that problem...

Minor aside: these models do not return the same answer every time you prompt them, which makes it harder to reason about their effectiveness.

You don't need to say "Minor aside" either. Thankfully language is a creative endeavour, not a scientific one.

Context: parent originally said "you should not say 'worth mentioning', if it's worth mentioning you can just say it". That sentence has now been edited out so my comment looks weird.

Your reply was so rude it convinced me to edit. Your second reply is a distortion of my original message too.

Well I'm glad it had the desired effect. Your comment was ruder.

I disagree; you have quoted me in a way that does not reflect the tone or content of what I wrote.

Tao mentions that the conventional approach for this problem seems to be a dead end, though it's apparently the super 'obvious' first step. This seems very hopeful to me, in that we now have a new line of approach to evaluate for related problems.

I think they would have a very strong case that using the mouse on a product is likely to confuse consumers about the origin of the product and therefore infringe on their trademark.

Nah, Disney seems to be genuinely letting it go. Amazon and other sites are flooded with Steamboat Willie merchandise at this point.

In fact I play cornhole competitively, and last year I picked up a set of Steamboat Willie themed bags:

https://www.logiccornhole.com/products/steamboat-willie-colo...


> Marvin Minsky "proved" that there was no way a perceptron-based network could do anything interesting

What result are you referring to?


Haven't read the page but a promising-looking search result is here: https://seantrott.substack.com/p/perceptrons-xor-and-the-fir...

I'm sure it's an oversimplification to blame the entire 1970s AI winter on Minsky, considering they couldn't have gotten much further than the proof-of-concept stage due to lack of hardware. But his voice was a loud, widely-respected one in academia, and it did have a negative effect on the field.


I suspect all Minsky did was reinforce what many people were already thinking. I experimented with neural nets in the late 80s and they seemed super interesting, but also very limited. My sense at the time was that the general thinking was, they might be useful if you could approach the number of neurons and connections in the human brain, but that seemed like a very far off, effectively impossible goal at the time.

When attackers can move laterally through everything because every internal tool leaks credentials and data there will be issues.

Internal tool doesn't have credentials. Checkmate ;)

There is something to be said about scope creep here

I saw this video yesterday and considered posting it, but I wasn’t sure if it was appropriate for HN.

This channel has another video where it shows how the clean room lab is created starting from a basic backyard shed, and that was truly astounding. The positive pressure to keep the number of particles low in someone’s backyard is almost mystical to me.


If you haven't seen this one, I highly recommend it:

Indistinguishable From Magic: Manufacturing Modern Computer Chips

https://www.youtube.com/watch?v=NGFhc8R_uO4&t=2070s

It's quite old, but I don't think there is a modern version of it.

I've tried posting it to HN a few times, but it hasn't gained traction for some reason. I find it absolutely mind-blowing.


Videos in general don’t get much traction here. Most of the time I don’t want to watch them in this context either, even though I do on other sites.

Maybe it’s just that I come here for the old-web feel, from when video was costly, rare and short.


Which is a pity, because lots of videos really need to be seen to be fully appreciated. Especially the ones showing stuff being made. And the ones that tend to show up here are usually worth the time.

I'm totally with the text folks on the five-hour FOSDEM sessions, though. Give me an accurate transcript I can grep or don't even bother.


There's something to it. I personally am happy to have one of these few precious places left where I can find content to read rather than watch.

Yeah, I understand and partially agree.

However, I've discovered wonderful gems like this RAM video.


Video links are naturally gonna get less clicks from people scrolling HN at work :)

The whole process was deep magic to me before I watched that video. It didn't seem any less magical after I watched it. More so, if anything.

Asianometry[0] has a number of videos on EUV lithography that cover some of the mind-blowing advances in the years since.

Veritasium[1] recently also made a video on the subject.

[0] https://www.youtube.com/playlist?list=PLKtxx9TnH76RYHY7L1YzE...

[1] https://www.youtube.com/watch?v=MiUHjLxm3V0


I rewatch that every year as a reminder of what we're capable of. Great video.

Yep, me too. Still feels magical after all these years.

This was the one that did it for me: "38C3 - From Silicon to Sovereignty: How Advanced Chips Are Redefining Global Dominance" https://www.youtube.com/watch?v=NdppYYfQJgg

Absolutely insane stuff.


This one I did not know! Thanks for sharing

I think if it's interesting to you then it's worth posting, and letting the voting system do its thing. I only rarely post, because by the time I've seen something it's usually already been posted.

> but I wasn’t sure if it was appropriate for HN

why even allow the cognitive overhead of worrying about such a thing? it's not for you to decide anyway - let the users decide using the voting system that's built for the task


Tbh this is exactly the sort of thing I'd come here to see

Recently I saw a post about Bonsai trees on the front page. Making your own RAM is 100% more relevant to HN than quite a few posts I see on the main page.

The HN crowd decides what is relevant

It's about intellectual curiosity, so it's both.

You’re not sure if someone building a RAM clean room in a shed is appropriate for HackerNews, literally “news for nerds”? A dictionary purchase may be warranted

I think he plans to go far beyond just making RAM in that clean room. This is pure speculation, but I suspect the goal of that channel is to build Doom from scratch.

Given that the shed in this guy’s backyard is already approaching the entire national technological output of any country in the 1970s, I think he may get there.


In a comment he says he is doing it for some research into a related thing (something to do with GaN sheets?)

Agree with the sentiment, but “news for nerds” is Slashdot.

Their standard is higher than that, "Stuff that matters."

Slashdot still exists?

Well, “exists” is a pretty broad spectrum.

I miss the comment tagging system: Insightful, Informative, Interesting, Funny. It would make sense for HN.

You forgot Troll, you insensitive clod!

"Score: 5, Troll" is the ultimate achievement.

To put that into context: some tags count as upvotes, others as downvotes, and "Troll" is a downvote. So for a post to be labelled "Troll" with a positive score, it has to have enough upvotes to compensate for the penalty from the "Troll" votes, but without another tag dominating. 5 is the maximum score.

"Score: 5, Troll" is therefore the mark of a very successful troll.


They also have "Underrated" and "Overrated" which apply points but do not act as tags. So I guess the easiest way to get +5 Troll is to have many Troll and Underrated votes, if it works the way I think it does.
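The arithmetic above can be sketched in a toy model. This is purely hypothetical code, not Slashdot's actual implementation; in particular, the "most common reason wins" tag-selection rule is a guess:

```python
# Toy sketch of Slashdot-style moderation scoring (hypothetical, not the
# real algorithm): each moderation is +1 or -1, the score is clamped to
# [-1, 5], and the displayed tag is the most common tag-bearing reason.
from collections import Counter

UPMODS = {"Insightful", "Informative", "Interesting", "Funny", "Underrated"}
# "Underrated"/"Overrated" move the score but are never shown as the tag.
SCORE_ONLY = {"Underrated", "Overrated"}

def moderate(start_score, mods):
    score = start_score
    for m in mods:
        score += 1 if m in UPMODS else -1
        score = max(-1, min(5, score))  # clamp to Slashdot's score range
    tagged = [m for m in mods if m not in SCORE_ONLY]
    tag = Counter(tagged).most_common(1)[0][0] if tagged else None
    return score, tag

# Enough Troll + Underrated votes yields the coveted "Score: 5, Troll":
score, tag = moderate(1, ["Troll", "Troll"] + ["Underrated"] * 6)
```

Under this model the parent's intuition checks out: "Underrated" pushes the score back up without ever displacing "Troll" as the visible tag.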

you must be new here

Yeah but it's a YouTube video. Those tend not to do super well on the front page.

It’s perfectly ok to submit links that don’t reach the front page!

Just gimme the transcripts for a speed read

There's no mention of AGI, climate change, AWS outages, Trump, crypto schadenfreude or my new MVP that you should totally sign up for even though I just vibe coded it 20 minutes ago and the DNS hasn't fully propagated yet, but the API is amazing plz like comment and subscribe.

Ok, maybe I'm being a bit cynical. Stories about bikeable cities are welcome too. And wasn't Soylent popular for a hot minute back in the day?


Yeah we need 20 more LLM submissions instead, that's the hard hitting content. /s

So what’s the point of having portable syntax, but not portable semantics?

C certainly gives the illusion of portability. I recall a fellow who worked on DSP programming, where chars and shorts and ints and longs were all 32 bits. He said C was great because that would compile.

I suggested to him that he'd have a hard time finding any existing C code that ran correctly on it. After all, how are you going to write a byte to memory if you've only got 32 bit operations?

Anyhow, after 20 years of programming C, I took what I learned and applied it to D. The integral types are specified sizes, and 2's complement.

One might ask: what about 16-bit machines? Instead of trying to define how this would work in official D, I suggested a variant of D where the language rules were adapted to 16 bits. This is not objectively worse than what C does, it works fine, and the advantage is that there is no false pretense of portability.
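The contrast being drawn can be illustrated from Python via ctypes, which reports the host C ABI's actual type sizes (a rough sketch of the idea, not D code): C's `int` and `long` are whatever the platform says, whereas fixed-width types mean the same thing everywhere.

```python
# C's integral types only have minimum sizes; the actual widths are
# whatever the host ABI chose, which is the portability trap described above.
import ctypes

platform_sizes = {
    "char": ctypes.sizeof(ctypes.c_char),    # >= 1 "byte" of CHAR_BIT bits
    "short": ctypes.sizeof(ctypes.c_short),  # >= 16 bits
    "int": ctypes.sizeof(ctypes.c_int),      # >= 16 bits
    "long": ctypes.sizeof(ctypes.c_long),    # >= 32 bits; 4 on Win64, 8 on Linux x86-64
}

# Fixed-width types remove the ambiguity; this is the approach D bakes
# into the language itself (int is always 32 bits, long always 64).
fixed_sizes = {
    "int8": ctypes.sizeof(ctypes.c_int8),
    "int32": ctypes.sizeof(ctypes.c_int32),
    "int64": ctypes.sizeof(ctypes.c_int64),
}
```

On the DSP described above, all four `platform_sizes` entries would legally be one 32-bit addressable unit, and code assuming `short` is narrower than `int` would compile cleanly and misbehave.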

