
No one. The usual.

Wrong or incomplete?

The current findings seem consistent with "both plaques and tangles are significant components of the pathology" and "our interventions are typically late and the accumulated neurological damage is already extreme by the time clinical symptoms show".

Attacking the plaques wasn't completely worthless - findings show that this often slows disease progression, especially in early cases. There are pre-symptomatic trials ongoing that may clear the air on whether "intervention is late" is the main culprit in treatment underperformance.


What? What would you replace the phones with? And why wouldn't whatever replaces them be able to do the same things?

Not really. Anthropic has the "CBRN filter" on the Opus series. It used to kill inquiries on anything even remotely related to biotech. Seems to have gotten less aggressive lately?

I was reverse engineering a medical device back in 2025, and that filter was hard-killing half my sessions.


Why not both? A pre-trained LLM has an awful lot of structure, and during SFT, we're still doing deep learning to teach it further. Innate structure doesn't preclude deep learning at all.

There's an entire line of work that goes "brain is trying to approximate backprop with local rules, poorly", with some interesting findings to back it.

Now, it seems unlikely that the brain has a single neat "loss function" that could account for all of its learning behaviors. But that doesn't preclude deep learning either. If the brain's "loss" is an interplay of many local and global objectives of varying complexity, it can still be a deep learning system at its core. Still doing a form of gradient descent, with non-backpropagation credit assignment and all. Just not the kind of deep learning system any sane engineer would design.
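
(If you want a concrete toy of what "non-backpropagation credit assignment" can look like: here's a minimal sketch of feedback alignment, where the error signal reaches earlier layers through a fixed random matrix instead of the transposed forward weights. Names and shapes are my own, not from any particular paper's code.)

  # Toy feedback alignment: local-ish credit assignment that approximates backprop
  import numpy as np

  rng = np.random.default_rng(0)
  W1 = rng.normal(0, 0.1, (64, 8))   # input -> hidden
  W2 = rng.normal(0, 0.1, (8, 1))    # hidden -> output
  B = rng.normal(0, 0.1, (1, 8))     # fixed random feedback matrix, NOT W2.T

  def step(x, y, lr=0.01):
      h = np.tanh(x @ W1)                   # forward pass
      y_hat = h @ W2
      err = y_hat - y                       # global error signal
      delta_h = (err @ B) * (1 - h ** 2)    # error routed back via B, not W2.T
      W2[:] -= lr * h.T @ err
      W1[:] -= lr * x.T @ delta_h
      return float((err ** 2).mean())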


Modern systems like Nano Banana 2 and ChatGPT Images 2.0 are very close to "just use Photoshop directly" in concept, if not in execution.

They seem to use an agentic LLM with image inputs and outputs to produce, verify, refine and compose visual artifacts. Those operations appear to be learned functions, however, not an external tool like Photoshop.

This allows for "variable depth" in practice. Composition builds on previous images, which may themselves have been generated from scratch or derived from earlier images.
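
Very roughly, the loop I'm picturing looks like the sketch below - every function name here is hypothetical and stubbed out, since neither lab has published the actual pipeline:

  # Hypothetical agentic generate -> inspect -> refine loop; the stubs stand in
  # for the model's learned image-in/image-out operations.
  from dataclasses import dataclass

  @dataclass
  class Critique:
      acceptable: bool
      notes: str = ""

  def generate_image(prompt): return f"<image for {prompt!r}>"          # stub
  def inspect_image(image, prompt): return Critique(acceptable=True)    # stub
  def refine_image(image, critique): return image                       # stub

  def compose(prompt, max_steps=4):
      image = generate_image(prompt)                # from scratch
      for _ in range(max_steps):
          critique = inspect_image(image, prompt)   # model reads its own output
          if critique.acceptable:
              break
          image = refine_image(image, critique)     # conditions on the previous image
      return image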


Evolution is an optimization process. So if platonic representation hypothesis holds well enough, there might be some convergence between ML neural networks and evolved circuits and biases in biological neural networks.

I'm partial to the "evolved low k-complexity priors are nature's own pre-training" hypothesis of where the sample efficiency in biological brains comes from.


The "platonic representation hypothesis" crowd can't stop winning.

Potentially useful for things like innate mathematical operation primitives. A major part of what makes it hard to imbue LLMs with better circuits is that we don't know how to connect them to the model internally, in a way that the model can learn to leverage.

Having an "in" on broadly compatible representations might make things like this easier to pull off.


You seem to be going off the title, which is plainly incorrect and not what the paper says. The paper demonstrates HOW different models can learn similar representations due to "data, architecture, optimizer, and tokenizer".

"How Different Language Models Learn Similar Number Representations" (actual title) is distinctly different from "Different Language Models Learn Similar Number Representations" - the latter implying some immutable law of the universe.


> latter implying some immutable law of the universe

I think the implication is slightly weaker -- it implies some immutable law of training datasets?


I don't understand your argument

"How X happens" still implies that X happens, just adds additional explanation on top


"How" = it can happen

Without "How" = it will happen


"using periodic features with dominant periods at T=2, 5, 10" seems inconsistent with "platonic representation" and more consistent with "specific patterns noticed in commonly-used human symbolic representations of numbers."

Edit: to be clear I think these patterns are real and meaningful, but only loosely connected to a platonic representation of the number concept.
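
To make that concrete: a period-10 (and period-5) feature repeats with the last decimal digit, which is pure base-10 notation, while a period-2 feature is just parity. A toy illustration of what I mean (my own code, not from the paper):

  import numpy as np

  n = np.arange(200)
  feats = {T: np.cos(2 * np.pi * n / T) for T in (2, 5, 10)}

  # the period-2 feature is parity; the period-10 feature repeats with the
  # last decimal digit, i.e. it tracks base-10 notation rather than magnitude
  assert np.allclose(feats[2][n % 2 == 0], 1.0)
  assert np.allclose(feats[2][n % 2 == 1], -1.0)
  assert np.allclose(feats[10][:10], feats[10][10:20])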


Is it an actual counterargument?

The "platonic representation" argument is "different models converge on similar representations because they are exposed to the same reality", and "how humans represent things" is a significant part of reality they're exposed to.


You should see my reply to convolvatron below.

I don't think this is a correct formulation of the platonic representation argument:

  different models converge on similar representations because they are exposed to the same reality
because that would be true for any statistical system based on real data. I am sure the platonic representation argument is saying something more interesting than that. I believe they are arguing against people like me, who say that LLMs are entirely surface correlations of human symbolic representation of ideas, and not actually capable of understanding the underlying ideas. In particular humans can speak about things chimpanzees cannot speak about, but that we both understand (chimps understand "2 + 2 = 4" - not the human sentence, but the idea that if you have a pair of pairs on one hand, and a quadruplet on the other, you can uniquely match each item between the collections). Humans and chimps both seem to have some understanding of the underlying "platonic reality," whatever that means.

"Not actually capable of understanding" is worthless unfalsifiable garbage, in my eyes. Philosophy at its absolute worst rather than science.

Trying to drag an operational definition of "actual understanding" out of anyone doing this song and dance might as well be pulling teeth. People have been trying to make the case for decades, and there's still no ActualUnderstandingBench to actually measure things with.


No, it is partially falsifiable. LLMs clearly don't understand the concept of quantity. They fail at tests designed to assess number understanding in dogs and pigeons; in fact they are quite likely to fail these tests, because they are wildly out of distribution.

We don't know how to demonstrate actual understanding, but we sure can demonstrate a lack of it. When it comes to abstract concepts like "three" or even "more," LLMs have a clear lack of understanding. Birds and mammals do not.


Which "tests", exactly? Do tell. Tests where LLMs don't beat a human baseline is genuinely hard to come by nowadays.

you're right, it's just that 'platonic' is an argument that numbers exist in the universe as objects in and of themselves, completely independent of human reality. if we don't assume this - if numbers are instead a system that humans created (formalism) - then sure, we can be happy that llms are picking common representations that map well onto our subjective notions of what numbers are.

FWIW it's objectively false that numbers are a system humans created. That's almost certainly true for symbolic numbers and therefore large numbers ( > 20). But pretty much every bird and mammal is capable of quantitative reasoning; a classic experiment is training a rat to press a lever X times when it hears X tones, or training a pigeon to always pick the pile with fewer rocks even if the rocks are much larger (i.e. ruling out the possibility of simpler geometric heuristics). Even bees seem to understand counting: an experiment set up 5 identical human-created (clearly artificial) landmarks pointing to a big vat of yummy sugar water. When the experimenters moved the landmarks closer together, the bees undershot the vat, and likewise overshot when the landmarks were moved further apart.

And of course similar findings have been reproduced etc etc. The important thing to note is how strange and artificial these experiments must seem to the animals involved - maybe not the bees - so e.g. it seems unlikely that a rat evolved to push a lever X times; it is much more plausible that in some sense the rat figured it out. At least in birds and mammals there seems to be a very specific center of the brain responsible for coordinating quantitative sensory information with quantitative motor output, handling the 1-1 mapping fundamental to counting. More broadly, it seems quite plausible that animals which have to raise an indeterminate number of live young would need a robust sense of small-number quantitative reasoning.

It is an interesting question as to whether this is some cognitive trick that evolved 200m years ago and humans are just utterly beholden to it. But I think it requires jumping through fewer hoops to conclude that the human theory of numbers is pointing to a real law of the universe. It's a consequence of conservation of mass/energy: if you have 5 apples and 5 oranges, you can match each apple to a unique orange and vice versa. If you're not able to do that, someone destroyed an apple or added an orange, etc. It is this naive intuitive sense of numbers that we think of as the "platonic concept" and we share it with animals. It seems to be inconsistent and flaky in SOTA reasoning LLMs. I don't think it's true that LLMs have stumbled into a meaningful platonic representation of numbers. Like any artificial neural network, they've just found a bunch of suggestive and interesting correlations. This research shows the correlations are real! But let's not overinflate them.


Regardless of whether the convergence is superficial or not, I'm especially interested in what this could mean for future compression of weights. Quantization of models is currently very dumb (per my limited understanding). Could exploitable patterns make it smarter?

That's more of a "quantization-aware training" thing, really.
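
Rough shape of what that means, as a toy sketch rather than any particular framework's API: you simulate the low-bit rounding during the forward pass, so the weights learn to live on the quantization grid instead of being rounded onto it after the fact.

  import torch

  def fake_quant(w, bits=4):
      # Simulate low-bit rounding in the forward pass; the straight-through
      # trick passes gradients as if no rounding had happened.
      qmax = 2 ** (bits - 1) - 1
      scale = w.abs().max() / qmax + 1e-12
      w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
      return w + (w_q - w).detach()

  # During training, use fake_quant(layer.weight) wherever the weight is read,
  # so the model adapts to the grid it will eventually be quantized to.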

Same with images maybe?

Saw a similar study comparing brain scans of a person looking at an image to a neural network processing the same image. And they were very 'similar'. Similar enough to make you go 'hmmmm, those look a lot alike, could a Neural Net have a subjective experience?'


"Subjective experience" is "subjective" enough to be basically a useless term for any practical purpose. Can't measure it really, so we're stuck doing philosophy rather than science. And that's an awful place to be in.

That particular landmine aside, there are some works showing that neural networks and the human brain might converge to vaguely compatible representations. Visual cortex is a common culprit, partially explained by ANN heritage perhaps - a lot of early ANN work was trying to emulate what was gleaned from the visual cortex. But it doesn't stop there. CNNs with their strong locality bias are cortex-alike, but pure ViTs also converge to representations similar to those of CNNs. There are also similarities found between audio transformers and auditory cortex, and a lot more findings like it.

We don't know how deep the representational similarity between ANNs and BNNs runs, but we see glimpses of it every once in a while. The overlap is certainly not zero.

Platonic representation hypothesis might go very far, in practice.


As someone actively researching in the neuroscience field, I find these ideas increasingly questionable. They do do a decent job of predicting neural data, depending on your definition and if you compare them to hand-built sets of features, but we're actually not even sure that will stay true. Especially in vision, we already know that as models have scaled up they actually diverge more from humans and use quite different strategies. If you want them to act like humans or better reflect neural data, you have to actively shape the training process to make that happen.

There's less we know about the language side of things currently, as that part of the field hasn't really figured out exactly what it's looking at yet, because we generally know less about language in the brain vs vision. I think most vision scientists are on board with the idea that these things have really been diverging and have to be coerced to be useful. Language is more up in the air, but there's a growing wave of papers lately that seem to call the human-LLM alignment idea into question.

Personally, I think the platonic representation idea is just a function of the convergence of training methods, data, and architectures all of these different labs are using. If you look at biological brains across species, and even individuals within a species, you see such an incredible variety of strategies and representations that it seems ridiculous to me that anyone would suggest there's some base way to represent reality shared across everyone and every species. Here are some articles that may be of interest if you're curious:

[1] https://arxiv.org/pdf/2211.04533

[2] https://www.nature.com/articles/s41586-025-09631-6

[3] https://www.biorxiv.org/content/10.1101/2025.03.09.642245v1


Guess I was extrapolating from words and images.

If you could brain scan a human, identify a shape of the network that corresponds to an emotion, and then identify that same shape in an ANN, could we say the ANN is experiencing an emotion?

I think it's loosely referred to as a "neural correlate".

I'm assuming what you're talking about with convergence would be these "neural correlates". And there's no reason we couldn't move beyond images to 'feelings'.


Not giving the data to researchers means not getting the scientific benefits from that data. Which was the point of collecting that data in the first place.

Reckless harm prevention is the root of many evils.


As a biostatistician who's touched epidemiological studies, I'd argue losing the trust of participants and the public is one of the biggest threats to the viability of the whole research enterprise. It's reckless to jeopardize that as well. Conversely, this dataset will be mined for at least 30-50 years - there are an infinite number of questions that can be asked of this data. Given that timescale, I think a little delay here is acceptable.

It's not a zero-sum game, you can both protect people and reap the benefits of health data. Many countries have much safer approaches. UK Biobank leads in the scale of its data, but not in its infrastructure.

That’s a false dichotomy.

Sensitive research systems thread that needle by giving researchers remote access, with the data remaining under the control and supervision of the responsible organization. Strong internal data access controls and data siloing, alongside strict, verified extraction routines. Specifically: limited project-dedicated DB access, full logging of data interactions, and full lockouts/freezes if something feels off.
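
A toy sketch of the shape of that, with made-up names rather than any real system's API:

  import logging

  log = logging.getLogger("data_access")

  class ProjectGateway:
      """Mediates every query for one approved project; raw data never leaves."""

      def __init__(self, project_id, allowed_tables, query_budget=500):
          self.project_id = project_id
          self.allowed_tables = set(allowed_tables)
          self.budget = query_budget
          self.frozen = False

      def check(self, table):
          ok = (not self.frozen) and table in self.allowed_tables and self.budget > 0
          log.info("project=%s table=%s allowed=%s", self.project_id, table, ok)
          if not ok:
              self.frozen = True   # freeze the whole project pending human review
              raise PermissionError("access denied; incident logged")
          self.budget -= 1
          # only past this point would the query actually run, inside the
          # organization's enclave, with outputs vetted before release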

‘The five safes’ is a good presentation from the NHS(?) a decade ago covering the approaches.

Data publishing restrictions around health data aren’t reckless. Modern computing and digital permanence mean we have to be extra cautious.


No, this is a real tradeoff.

Any friction you add to the "access the data" process makes it harder for legitimate researchers to get access to, and get benefits from, that data.

So, at what point do stricter data controls begin to choke you at the throat?


We have dozens of data/db startups - kinda odd that there isn't one (that I have seen) that focuses on this problem.

Perhaps our future AI overlords will feel it's important to compartmentalise and log data access more aggressively.


I do expect this to have a "novelty edge" over human opponents - which can be closed with practice, on the human end.

And, like many AIs, it can have "jagged capability" gaps, with inhuman failure modes living in them - which humans can learn to exploit, but the robot wouldn't adapt to their exploitation because it doesn't learn continuously. Happened with various types of ML AIs designed to fight humans.


Only if you assume the AI can't improve. Otherwise, AI has a fundamental edge over humans in that it doesn't get old and die, and can be copied perfectly without an expensive retraining period.

Oh, they can. They just need a human touch to actually improve.

For now. It's a work in progress.


Chess players learned to exploit chess computers’ weaknesses in the beginning too, but they can’t any longer. This version of the robot might not learn continuously, but the next will be better.

I believe there are still some echoes of the concept. Even top engines will play certain grandmaster draw lines unless told more or less explicitly not to. So if you were playing a match against Stockfish you'd want to play the Berlin draw as White every time, for example.

But chess is a turn-based game where there's no deception (in the sense that both players can see all legal moves for both themselves and their opposition at all times), whereas table tennis is played in real time, it's fast as hell, the table is small, and the ball can have 2 or 3 different spin types from the same arm/hand/wrist movement, and can land in a number of different spots.
