Tenobrus · 2026-05-20T20:06:00 1779307560

what basis do you have for assuming an LLM is fundamentally incapable of doing this?

truncate · 2026-05-20T20:18:30 1779308310

What's your basis for assuming an LLM is capable of doing this?

I honestly don't know either way. Based on my limited understanding of how LLMs work, I don't see them making the next great song or the next great book, and on that reasoning I'm betting that they probably won't be able to do whatever the next "Descartes, Newton, Leibniz, Gauss, Euler, Ramanujan, Galois" are going to do.

Of course, if AI as a wider field comes up with something more powerful than LLMs, that would be different.

EMM_386 · 2026-05-20T21:27:36 1779312456

"I don't see them be making the next great song"

Meanwhile, songs are hitting number one on some charts on Spotify that people think are humans and are actually AI. And Spotify has to start labelling them as such. One AI "band" had an entire album of hits.

Also - music is subjective. Mathematics isn't.

And in this case, an LLM discovered a new way to reason about a conjecture. I don't know how much proof is needed - since that is literally proof that it can be done.

truncate · 2026-05-20T21:58:24 1779314304

>> Meanwhile, songs are hitting number one on some charts on Spotify that people think are humans and are actually AI. And Spotify has to start labelling them as such. One AI "band" had an entire album of hits.

There are quite a few questions around that. Music is subjective and obviously different people have different tastes, but I wouldn't call any of them actual good music / real hits.

>> LLM discovered a new way to reason about a conjecture

I wasn't questioning LLMs' ability to prove things. Parent threads were talking about building a new kind of maths, or approaching it in a creative/artistic way. That's what I was referring to.

I can't speak for maths or hard science as I'm not trained in them, but the creativity aspect in code is definitely lacking when it comes to LLMs. It may not matter down the line.

dist-epoch · 2026-05-20T21:20:22 1779312022

LLMs are already making the next great songs. Just check out the Billboard charts.

truncate · 2026-05-20T22:00:54 1779314454

I'm sorry, I don't consider them "great songs". Obviously, different people have different taste.

blueone · 2026-05-20T20:21:40 1779308500

> what basis do you have for assuming an LLM is fundamentally incapable of doing this?

because I have no basis for assuming an LLM is fundamentally capable of doing this.

sswatson · 2026-05-20T20:35:21 1779309321

Good on you for spelling out this reasoning, but it is manifestly unsound. For a wide variety of values of X, people a few years ago had no reason to expect that LLMs would be capable of X. Yet here we are.

TheOtherHobbes · 2026-05-20T20:41:09 1779309669

In 1989, Garry Kasparov said that it was "ridiculous!" to suggest a computer would ever beat him at chess.

"Never shall I be beaten by a machine!"

In 1997 he lost to Deep Blue.

FartyMcFarter · 2026-05-20T20:52:36 1779310356

Yeah, and back then people moved the goal posts too, saying Deep Blue was just "brute-forcing" chess (which isn't even true since it's not a pure minimax search).

bananaflag · 2026-05-20T22:01:03 1779314463

Deep Blue was brute forcing chess in the sense that AlphaGo wasn't brute forcing Go.

FartyMcFarter · 2026-05-22T01:32:16 1779413536

Both of them contained a search algorithm that explored some moves from each considered position, usually not all moves. Both of them contained logic (learned or programmed) to evaluate moves and/or positions.

The differences between them are many, but brute force doesn't enter into it in either case.
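
As a toy illustration of "explored some moves ... usually not all moves", here is a minimal Python sketch of alpha-beta pruning next to plain minimax. It is not Deep Blue's or AlphaGo's actual code: the tree, the leaf scores (standing in for an evaluation function), and the function names are all made up for the example.

```python
# Hypothetical toy game tree: inner lists are positions, numbers are leaf
# scores that stand in for a position-evaluation function.
TREE = [[3, 5, 6], [9, 1, 2], [0, -1, 4]]


def minimax(node, maximizing, counter):
    """Plain minimax: evaluates every leaf. counter[0] counts leaf visits."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, counter,
              alpha=float("-inf"), beta=float("inf")):
    """Minimax with alpha-beta pruning: skips moves that cannot change the result."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent would never allow this line
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:  # we already have a better option elsewhere
            break
    return best


if __name__ == "__main__":
    full, pruned = [0], [0]
    print(minimax(TREE, True, full), full[0])       # 3 9
    print(alphabeta(TREE, True, pruned), pruned[0])  # 3 6
```

Both searches agree on the value, but the pruned one looks at 6 of the 9 leaves. Real engines combine this kind of selective search with far richer evaluation, which is the point being made above.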

Applejinx · 2026-05-21T00:05:56 1779321956

And today he's got salient observations on politics which hold much of his attention, and Deep Blue is shut off and has done nothing further.

Not a good argument for turning everything over to the Deep Blues. What's Deep Blue done for me lately?

zardo · 2026-05-20T21:07:46 1779311266

This is something that could be demonstrated rather than just argued.

Train an LLM only on texts dated prior to Newton and see if it can create calculus, derive the equations of motion, etc.

If you ask it about the nature of light and it directs you to do experiments with a prism I'd say we're really getting somewhere.

gjm11 · 2026-05-20T22:15:41 1779315341

We tried this experiment with humans, back in the 17th century, and only a few[1] out of millions managed it given a whole human lifetime each.

[1] Obviously Newton counts as one. Leibniz, like Newton, figured out calculus. Other people did important work in dynamics, though no one else's was as impressive as Newton's. But the vast majority of human-level intelligences trained on texts prior to Newton did not create calculus or derive the equations of motion or come close to doing either of those things.

davebren · 2026-05-21T02:58:34 1779332314

Newton did it at 23 and there would have been very few people with mathematical training. The LLM would be trained on the entirety of recorded human knowledge and mathematics up to that point, and would get to use a lot more energy so it still has a massive material advantage over young Isaac. Yet I don't believe calculus would magically appear in its response.

necovek · 2026-05-21T05:09:21 1779340161

A good way to look at it is to compare it to today: LLMs are already trained and are operationalizing a lot more mathematical knowledge than any human, including experts.

Why are they not coming up with paradigm shifts in knowledge expression/discovery like humans did back then?

Are we just not prompting them right?

famouswaffles · 2026-05-21T06:59:40 1779346780

LLMs have been trained on a lot more data than any single human (text-wise at least) for years now, and these sorts of results have only been possible for the latest crop of models in the past few months. Models get better as they get better.

necovek · 2026-05-22T05:42:43 1779428563

The argument is whether models of today, suitably trained on pre-17th century data (if a comparable quantity was available), would be able to "invent" calculus et cetera.

If we believe today's models are sufficiently capable to have been able to do so, why are we not getting these types of results today compared to the entire world knowledge and especially math?

Are research mathematicians simply not prompting LLMs in the right way?

pickleRick243 · 2026-05-20T20:42:49 1779309769

Except this has been said since the 2010's and has been proven wrong again and again. Clearly the theory that LLM's can't "extrapolate" is woefully incomplete at best (and most likely simply incorrect). Before the rise of ChatGPT, the onus was on the labs to show it was plausible. At this point, I think the more epistemologically honest position is to put the burden back on the naysayers. At the least, they need to admit they were wrong and give a satisfactory explanation why their conceptual model was unable to account for the tremendous success of LLM's and why their model is still correct going forward. Realistically, progress on the "anti-LLM" side requires a more nuanced conceptual model to be developed carefully outlining and demonstrating the fundamental deficiencies of LLMs (not just deficiencies in current LLMs, but a theory of why further advancements can't solve the deficiencies).

Incidentally, similar conversations were had about ML writ large vs. classical statistics/methods, and now they've more or less completely died down since it's clear who won (I'm not saying classical methods are useless, but rather that it's obvious the naysayers were wrong). I anticipate the same trajectory here. The main difference is that because of the nature of the domain, everyone has an opinion on LLM's while the ML vs. statistics battle was mostly confined within technical/academic spaces.

davebren · 2026-05-21T03:08:02 1779332882

> Clearly the theory that LLM's can't "extrapolate" is woefully incomplete at best (and most likely simply incorrect).

What example is there where an LLM has extrapolated? All I've seen is a data set so large and an extra decomposition process making it so interpolation feels like extrapolation if you don't look close enough.

> but a theory of why further advancements can't solve the deficiencies

How about LeCun's?

dvt · 2026-05-20T20:09:37 1779307777

Because by definition LLMs are permutation machines, not creativity machines. (My premise, which you may disagree with, is that creativity/imagination/artistry is not merely permutation.)

fnordpiglet · 2026-05-20T20:20:43 1779308443

I prefer to think of it as they’re interpolation machines not extrapolation machines. They can project within the space they’re trained in, and what they produce may not be in their training corpus, but it must be implied by it. I don’t know if this is sufficient to make them too weak to create original “ideas” of this sort, but I think it is sufficient to make them incapable of original thought vs a very complex to evaluate expected thought.

drdeca · 2026-05-21T01:03:04 1779325384

People keep saying this, but if you try to interpret this at all literally, it just doesn’t work. Like, it’s phrased like it should have a precise meaning, right? Like, people even mention convex hulls when talking about it.

But if you actually try to take a convex hull of some encoding of sentences as vectors? It isn't true. The outputs are not in the convex hull of the training data.

I guess it's supposed to be a metaphor and not literal, but in that case it's confusing. Especially seeing as there are contexts in machine learning where literal interpolation vs literal extrapolation is relevant. So, please, find a better way to say it than saying that "it can only interpolate"?

Muromec · 2026-05-21T14:59:40 1779375580

If it's all just points in the multidimensional space, why would the thing be restricted to some operations and not others? I'm not buying the argument.

drdeca · 2026-05-21T22:27:32 1779402452

Sorry, I don't understand what you mean. Are you agreeing or disagreeing with me?

If it can only interpolate in a literal sense, that means that it only produces good outputs on convex combinations of inputs that appear in the training set. That's what interpolation means. But, if you take the embedding vectors of sentences/prompts, and then take the convex hull of these, it is not typical for new sentences not in the training set to have their embedding vectors be in the convex hull of these.
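
A rough way to see the geometric part of this claim, as a minimal Python sketch. Loud assumption: random Gaussian vectors stand in for embeddings here, which real sentence embeddings are not, and the function name and parameters are made up for illustration. The check is a separating-hyperplane certificate, so it can only prove a point is outside the hull and the printed fraction is a lower bound.

```python
import random


def fraction_provably_outside(dim, n_train=500, n_new=50, seed=0):
    """Estimate how often a fresh point is provably outside the convex hull
    of a training cloud, using a separating-hyperplane certificate.

    Assumption: points are i.i.d. standard Gaussian vectors, a stand-in for
    sentence embeddings. The certificate is one-sided, so the returned value
    is a lower bound on the true fraction outside the hull.
    """
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    cloud = [draw() for _ in range(n_train)]
    mean = [sum(p[k] for p in cloud) / n_train for k in range(dim)]

    outside = 0
    for _ in range(n_new):
        x = draw()
        # Direction from the cloud's centre towards the new point.
        w = [x[k] - mean[k] for k in range(dim)]
        proj_x = sum(wk * xk for wk, xk in zip(w, x))
        proj_max = max(sum(wk * pk for wk, pk in zip(w, p)) for p in cloud)
        # If x projects strictly further along w than every training point,
        # the hyperplane w.y = proj_max separates x from the whole hull.
        if proj_x > proj_max:
            outside += 1
    return outside / n_new


if __name__ == "__main__":
    for dim in (2, 50):
        print(dim, fraction_provably_outside(dim))
```

Under these assumptions, new points in 2 dimensions almost always fail the certificate (they mostly sit inside the hull), while in 50 dimensions nearly every new point passes it, i.e. lies outside the hull of the training cloud.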

fnordpiglet · 2026-05-23T03:00:00 1779505200

I’m not sure I follow your end to end reasoning. In an n dimensional space interpolation along and within the convex hull is pretty much what they’re doing. How can it possibly not be? How would it interpolate a point that’s not within its vector space? Yes, it’s very complex with non linear transformations and a very high dimensionality, and residuals and other features create more complexity in the shape of the hull. But an LLM can not infer a concept to which it has no information channel. That’s clearly nonsense. The fact that they do bounded, learned, nonlinear compositional generalizations over a representational space induced by training -is by nature interpolation- not extrapolation. I’m sorry, but I believe their immense power has you confusing math with magic.

lukol · 2026-05-20T20:23:18 1779308598

This "new math" might be a recombination of things that we already know - or an obvious pattern that emerges if you take a look at things from a far enough distance - or something that can be brute-forced into existence. All things LLMs are perfectly capable of.

In the end, creativity has always been a combination of chance and the application of known patterns in new contexts.

dvt · 2026-05-20T20:27:26 1779308846

> This "new math" might be a recombination of things that we already know

If you know anything about the invention of new math (analytic geometry, Calculus, etc.), you'd know how untrue this is. In fact, Calculus was extremely hand-wavy and without rigorous underpinnings until the mid 1800s. Again: more art than science.

jfyi · 2026-05-20T20:51:37 1779310297

Newton and Leibniz were "hand-waving"?

If anything, they were fighting an uphill battle against the perception of hand-waving by their contemporaries.

dehsge · 2026-05-20T22:54:06 1779317646

It’s not that. Consider the definition of the limit. The idea existed for a long time. Newton/Leibniz had the idea.

That idea wasn’t formally defined until 134 years later, with Cauchy's epsilon-delta definition. Then it was accepted. (I know that there were earlier proofs.)

There are even arguments that the limit existed before Newton and Leibniz, with Archimedes' limits on the value of pi.

Cauchy’s deep understanding of limits also led to the creation of complex function theory.

These forms of creation are hand-wavy not because they are wrong. They are hand-wavy because they leverage a deep level of ‘creative-intuition’ in a subject.

An intuition that a later reader may not have and will want to formalize to deepen their own understanding of the topic, often leading to deeper understanding and new maths.
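
For reference, the epsilon-delta statement of a limit mentioned above, in its modern textbook form (not a quotation from Cauchy), reads roughly:

```latex
\lim_{x \to a} f(x) = L
\iff
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x :\;
0 < |x - a| < \delta \implies |f(x) - L| < \varepsilon
```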

dvt · 2026-05-20T21:05:16 1779311116

> Newton and Leibniz were "hand-waving"?

Yes, and it's pretty common knowledge that Calculus was (finally) formalized by Weierstrass in the mid-19th century, having spent almost two centuries in mathematical limbo. Calculus was intuitive, solved a great class of problems, but its roots were very much (ironically) vibes-based.

This isn't unique to Newton or Leibniz; Euler did all kinds of "illegal" things (like playing with divergent series, treating differentials as actual quantities, etc.) which worked out and solved problems, but were also not formalized until much later.

anthk · 2026-05-20T22:10:27 1779315027

Euclid tells me otherwise. Rules, no art, no bullshit. Rules. Humanities people somehow never get it. It's not about arithmetic.

Vibe-what? Vibe-bullshit, maybe; cathedrals in Europe and such weren't built by magic. Ditto with sailing and the like. Tons of mathematics and geometry there, and tons of damn axioms before even the US existed.

Heck, even the Book of the Games from Alfonso X "The Wise" has both a compendium of game rules and even this https://en.wikipedia.org/wiki/Astronomical_chess where OFC being good at geometry was mandatory, at least to design the boards.

On Euclid:

https://en.wikipedia.org/wiki/Euclid%27s_Elements

PS: Geometry has tons of grounds for calculus. Guess why.

jfyi · 2026-05-20T21:14:15 1779311655

I think that I just take issue with the term "hand-waving" as equated to intuition. Yeah it lacked formal rigor, but they had a solid model that applied in detail to the real world. That doesn't come from just saying, "oh well, it'll work itself out". I guess if you want to call that "hand-wavy" we'll just have to disagree.

anthk · 2026-05-21T06:28:35 1779344915

Euclid disproves every bullshit posted by LL Mediocres unable to understand that before Calculus there were proto-calculus ideas such as Zeno's paradoxes and some writings from Archimedes which pretty much are Calculus 0.9.

American and British geeks/nerds are blinded by Newton, unable to realize that there was tons of previous work since the Greeks and in the Middle Ages, whose people the British love to depict as brutes with no culture at all.

And the case is that they weren't dumb at all, and without Euclid and Archimedes there wouldn't be any Calculus.

https://en.wikipedia.org/wiki/Euclid%27s_Elements

https://en.wikipedia.org/wiki/Method_of_exhaustion

baq · 2026-05-20T20:41:10 1779309670

And yet nowadays you can restate all of it using just combinations of sets of sets and some logic operators.

nh23423fefe · 2026-05-20T20:13:08 1779307988

god of the gaps

iwontberude · 2026-05-20T21:43:31 1779313411

non overlapping magisteria

satvikpendem · 2026-05-20T20:39:22 1779309562

What is creativity if not permutation? A brain has some model of the world and recombines concepts to create new concepts.

d3ffa · 2026-05-20T21:12:09 1779311529

[flagged]

rowanG077 · 2026-05-20T21:33:03 1779312783

This is really not an acceptable reply. How about actually engaging with the point the commenter made instead of stamping your foot and throwing a tantrum.

anthk · 2026-05-20T22:11:33 1779315093

Innovation is just another word for the term 'enhanced copy'. Everything is a copy, except for nature.

KoolKat23 · 2026-05-20T20:12:32 1779307952

It pretty much is, otherwise it is randomness or entropy.

lajamerr · 2026-05-20T20:21:22 1779308482

LLMs by themselves are not able to but you are missing a piece here.

LLMs are prompted by humans and the right query may make it think/behave in a way to create a novel solution.

Then there's a third factor now with agentic AI system loops with LLMs, where it can research, try, and experiment in its own loop that's tied to the real world for feedback.

Agentic + LLM + Initial Human Prompter by definition can have it experiment outside of its domain of expertise.

So that's extending the "LLM can't create novel ideas" claim, but I don't think anyone can disagree that the three elements above are enough ingredients for an AI to come up with novel ideas.

awesome_dude · 2026-05-20T20:39:19 1779309559

You're proving the GP's argument - LLMs aren't creative, you say as much; it's the driving that is the creative force.

lajamerr · 2026-05-20T20:48:45 1779310125

You can tell an agentic system. "Go and find a novel area of math that has unresolved answers and solve it mathematically with verified properties in LEAN. Verify before you start working on a problem that no one has solved this area of math"

That's not a creative prompt. That's a driving prompt to get it to start its engine.

You could do that nowadays, and while it may spend $1,000 to $100,000 worth of tokens, it will create something humans haven't done before as long as you set it up with all its tool calls/permissions.
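
To give a sense of what "verified properties in LEAN" means in practice, here is a minimal Lean 4 sketch of a machine-checked statement. It is deliberately trivial and already known (the theorem name is made up for the example); the point is only that the proof is accepted by the checker rather than by a human reader.

```lean
-- A deliberately trivial, already-known fact, stated and checked in Lean 4.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```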

awesome_dude · 2026-05-20T23:10:25 1779318625

Let me know when the Fields medal arrives in the mail.

It won't, because even though it looks clever to you, people who /do/ understand math and LLMs understand that LLMs /are/ regurgitating.

Why does your LLM need you to tell it to look in the first place? Why isn't it just telling us all the answers to unsolved conjectures, known and unknown?

Why isn't the LLM just telling us all the answers to all the problems we are facing?

Why isn't the LLM telling us, step by step with zero error, how to build the machine that can answer the ultimate question?

astrange · 2026-05-20T23:40:42 1779320442

Here's a Fields Medalist commenting who doesn't seem to believe that.

https://x.com/wtgowers/status/2057175727271800912

awesome_dude · 2026-05-20T23:53:28 1779321208

Um - all I see is

> Timothy Gowers @wtgowers

> @wtgowers

> If you are a mathematician, then you may want to make sure you are sitting down before reading further.

If your refutation requires someone to have an account, login, and read something - it's meaningless

defrost · 2026-05-20T23:59:43 1779321583

Try https://xcancel.com/wtgowers/status/2057175727271800912

it's readable to most; it's annoying having to wade through ex-Twitter .. but there are workarounds.

awesome_dude · 2026-05-21T00:11:28 1779322288

Thanks - I'll read that and the above linked OpenAI PR

But, I remain sceptical

defrost · 2026-05-21T00:28:43 1779323323

The (linked by OpenAI) comment paper by various tangential mathematicians was the most interesting read from my PoV:

https://cdn.openai.com/pdf/74c24085-19b0-4534-9c90-465b8e29a...

it includes the longer remarks by Gowers & others.

astrange · 2026-05-21T00:08:11 1779322091

https://cdn.openai.com/pdf/74c24085-19b0-4534-9c90-465b8e29a...

charlie90 · 2026-05-20T22:00:15 1779314415

I believe when we have AI Agents "living" 24/7, they will become creative machines. They will test out their own ideas experimentally, come across things accidentally, synthesize new ideas.

We just haven't let AI run wild yet. But it's coming.

awesome_dude · 2026-05-20T23:13:31 1779318811

So are self-driving cars - as they have been for the last... decade or so

AGI has been "just over the horizon" for literal decades now - there have been a number of breakthroughs and AI Winters in the past, and there's no real reason to believe that we've suddenly found the magic potion, when clearly we haven't.

AI right now cannot even manage simple /logic/

Barbing · 2026-05-20T20:46:15 1779309975

If that’s a requirement, aren’t LLMs driven by pretraining which was human driven?

Who decides the last point at which it’s OK to provide text to the model in order to be able to describe it as creative? (non-rhetorical)

Tenobrus · 2026-05-04T03:34:37 1777865677

o1 has a METR time horizon of around 40 minutes, opus 4.7 has an implied horizon of 18 hours based on its ECI score. this study is on a model that's several generations behind wrt the kind of tasks it can complete. it would be shocking if this number were anywhere near as low with GPT 5.5, to the point it seems nearly totally irrelevant to talk about these results

Tenobrus · 2026-03-29T20:57:04 1774817824

"Japanese soldier who kept fighting 29 years after World War 2"

cedws · 2026-03-29T21:02:22 1774818142

I watched a talk from Bjarne Stroustrup at CppCon about safety and it was pretty second hand embarrassing watching him try to pretend C++ has always been safe and safety mattered all along to them before Rust came along.

einpoklum · 2026-03-29T21:52:23 1774821143

Well, there has been a long campaign against manual memory management - well before Rust was a thing. And along with that, a push for less use of raw pointers, fewer index loops etc. - all measures which, when adopted, reduce memory safety hazards significantly. Following the Core Guidelines also helps, as does using spans. Compiler warnings have improved, as has static analysis, also in a long process preceding Rust.

Of course, this is not completely guaranteed safety - but safety has certainly mattered.

cedws · 2026-03-29T22:54:28 1774824868

>Following the Core Guidelines also helps

Yes, this is what Stroustrup said and it makes me laugh. IIRC he phrased it with more of a 'we had safety before Rust' attitude. It also misses the point: safety shouldn't be opt-in or require memorising a rulebook. If safety is that easy in C++, why is everyone still sticking their hand in the shredder?

einpoklum · 2026-03-30T12:34:13 1774874053

You're "moving the goal posts" of this thread. Safety has mattered - in C++ and in other languages as well, e.g. with MISRA C.

As for the Core Guidelines - most of them are not about safety; and - they are not to be memorized, but a resource to consult when relevant, and something to base static analysis on.

Tenobrus · 2026-02-26T23:31:20 1772148680

those two stipulations were always their only ones, and they were included explicitly in their original contract with the DoW.

Tenobrus · 2026-01-11T07:06:35 1768115195

some ai detectors work now. pangram detects this as 57% AI written, and the parts it thinks are human are.... the ascii diagrams / screenshots. all the actual text it detects as generated.

Tenobrus · 2026-01-10T02:46:48 1768013208

strongly think you should go read the thread to get a sense of the level of expertise and amount of effort put in by the humans involved: https://www.erdosproblems.com/forum/thread/728#post-2852

Tenobrus · 2025-12-24T01:25:04 1766539504

ai detectors are never totally accurate but this one is quite good and it suggests something like 80% of this article is llm generated. honestly idk how you didn't get that just by reading it tho, maybe you haven't been exposed to much modern llm-generated content?

https://www.pangram.com/history/5cec2f02-6fd6-4c97-8e71-d509...

Tenobrus · 2025-08-13T22:23:21 1755123801

A lot of people forget how whimsical and strange and beautiful the old smaller GPT models and the original 3 base model pre-RLHF could be. Nowadays hundreds of millions of people have talked to heavily assistant-tuned 4 or 5, but comparatively very few people have ever even seen GPT-1 outputs. It's cheap to run so I threw up a simple interface + single server hosting it.

Tenobrus · on March 29, 2024

It looks like the person who added the backdoor is in fact the current co-maintainer of the project (and the more active of the two): https://tukaani.org/about.html

kzrdude · on March 30, 2024

In various places they say Lasse Collin is not online right now, but he did make commits a week ago https://git.tukaani.org/?p=xz.git;a=summary

kzrdude · on March 29, 2024

Makes me wonder if he's an owner of the github organization, and what happens with it now?

Tenobrus · on Feb 22, 2024

it is very clear to me that humans do in fact have a recursive self-improvement ability, and i'm confused why you think otherwise

astrange · on Feb 22, 2024

I think people can read books (self improvement) and have children (recursive), but neither of those is both.

lucubratory · on Feb 22, 2024

Why do you think that the human population is more intelligent, knowledgeable, and achieves greater technological feats as time goes on? It's because of recursive self-improvement, we are raised and educated into being better in a quite general sense, which includes being better at raising and educating; nearly every generation this cycle repeats and has for all of human history, at least since we acquired language. We also build machines that help us to make better machines, and then we use those better machines to make even better machines, another example of recursive self-improvement.

webmaven · on Feb 24, 2024

You're pointing out that groups/institutions/cultures/civilizations are examples of recursively self-improving entities, but the original point was about a recursively self-improving individual intelligent entity.

Well, to the extent that a human-level intelligence is an individual, anyway. We ourselves are probably a mixture-of-experts in some sense.

lucubratory · on Feb 24, 2024

An individual human starts out a mewling baby and can end up a maxillofacial surgeon through at least partial examples of recursive self-improvement. Learn to walk, talk, read, write, structure, argue, essay, study, cite etc all the way through to the end, with what you previously learned allowing you to learn even more. There's a huge amount of outside help, but at least some of it is also self-improvement.

Also, for the purposes of talking about the phenomenon of recursive self-improvement, individual vs society isn't the end of the analysis. Part of the reason AI recursive self-improvement is concerning is that people are worried about it happening on much-faster-than-societal timescales, in ways that are not socially tractable like human societies are (e.g. if our society is "improving" in a way we don't like, we or other humans can intervene to prevent, alter, or mitigate it). It's also important to note that when we're talking about "recursive self-improvement" when it comes to AI, the "self" is not a single software artifact like Llama-70B. The "self" is AI in general, and the most common proposed mechanism is that an AI is better than us at designing and building AIs, and the resulting AI it makes is even better at designing and building AIs.

rralian · on Feb 22, 2024

New generations build onto the scientific knowledge of previous generations. It may not be fast but that sounds like recursive improvement to me. It seems reasonable for AI to accelerate this process.

astrange · on Feb 22, 2024

I think saying all of society is doing it is plausible, but not the same thing as a single human or AI doing it.

Though… still don't think it's true. Isn't "society is self improving" what they call Whig history?

killerstorm · on Feb 22, 2024

AI might have multiple instances within a single computing environment, so it's more like a population than a single individual.

I.e. "You can only use the memory which you currently use" would be a weird artificial constraint not relevant in practice.

spacecadet · on Feb 22, 2024

A very small percentage maybe. I think I agree with the notion that most people bias toward thinking they are improving while actually self-sabotaging.