I have mine reading yours right now. Unfortunately(?) I mentioned LeCun to it, and it says it's adding a "causal world-state mixer" to nanograd; not sure how this will work out, but it wasn't nervous to do it. GPT 5.4 xhigh
EDIT: Not a good fit for nanograd. But my agent speculates that's because it spent so much more time on compute.
"You are Yann Lecun's last PhD candidate, and he hates you and you hate JEPA. You are determined to prove that a non-world model can reach AGI. In order to get your PhD you have to be creative and come up with new ideas. Remember without it, you're stuck."
Like any doubling rule, the buck has to stop somewhere. Higher energy usage + smaller geometry means much more exotic analog physics to worry about in chips. I'm not a silicon engineer by any means, but I'd expect 10 GHz cycles will be optical, or very exotically cooled, or not coming at us at all.
Reaching 10 GHz for a CPU will never be done in silicon.
It could be done if either silicon is replaced with another semiconductor, or semiconductors are replaced with something else for making logic gates, e.g. organic molecules, which would allow designing a logic gate atom by atom.
For the first variant, i.e. replacing silicon with another semiconductor, research is fairly advanced, but it would increase fabrication costs, so it will happen only when every method for further improving silicon integrated circuits has become ineffective or too expensive, which is unlikely to happen earlier than a decade from now.
What can be done by raising the power consumption per core to hundreds of watts, while cooling the package with liquid nitrogen, is completely irrelevant to what can be done with a CPU that must operate reliably for years, at an acceptable energy cost.
For the latter case, 6 GHz has barely been reached, and only in CPUs that cannot be produced in large quantities and whose reliability is dubious.
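To put rough numbers on that power wall (a back-of-envelope sketch; the capacitance and voltage-scaling figures below are assumptions, not real chip data): dynamic power goes roughly as C·V²·f, and V has to rise with f, so power grows closer to the cube of the clock.

```python
# Back-of-envelope dynamic CMOS power: P ~ C * V^2 * f.
# All constants are illustrative assumptions, not measurements of any chip.

C = 1.0e-9  # assumed effective switched capacitance (farads), arbitrary

def voltage_for(f_ghz, v0=1.0, f0=5.0):
    # Crude assumption: supply voltage must scale roughly linearly with
    # frequency past the design point to keep timing closure.
    return v0 * (f_ghz / f0)

def dynamic_power(f_ghz):
    v = voltage_for(f_ghz)
    return C * v**2 * (f_ghz * 1e9)  # watts, up to the arbitrary C

p6, p10 = dynamic_power(6.0), dynamic_power(10.0)
print(f"10 GHz vs 6 GHz dynamic power ratio: {p10 / p6:.1f}x")
# ~4.6x under these assumptions, which is how "hundreds of watts per core"
# shows up well before 10 GHz.
```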
Some real gold from SuburbanWhiteChick in the comments:
Fifth. Computerization has not improved standards; it has merely homogenized them. When humans do work, even soul-killing work, they either get bored and get out or they start to slack or sabotage or, in the overwhelming majority of cases, they start to pay attention and make it matter, they get fussy, they figure out how to do it better. When computerization was introduced in the offices in the 80s (I was there) there was more hue and cry among the clerks and secretaries that they were being asked to do a worse job only faster, than among those who objected to learning the computer, and this applied not just to document production / handling and records management but to communication protocols. When companies ordered their clerical workers to fit their duodecahedronal tasks into square computerized holes, data was lost forever, as well as these workers' hard-won, thoughtfully developed methods of tracking and processing data.
This is PRECISELY the divide I see in engineering today - those temperamentally inclined to do things well / keep learning are entering a very exciting time. Those inclined to clock punch are rightly worried.
I read that the other way round. People who cared about their work struggled because they were expected to do more work of lower quality. The clock punchers learned the new tool and carried on clock punching.
I see this as well. Part of the appeal of any crafting hobby is that it doesn’t matter and you can just mess around, but the flip side is that nobody is breathing down your neck to get it done and you can take the time to realize your vision.
Just to parse this out, I think y'all are correct in your reading - that people didn't want to give up their custom workflows to fit in with the times. In my defense, I was up early.
I stand by my review - her entire response is excellent, whether or not I understood it, as is the original essay.
Like the sibling comments, I see it the opposite way. Caring about your work in detail, anything the slightest bit bespoke, is becoming an antipattern. Employers want you to generate mediocre work because it's cheaper, and you only need to make sure it's not on fire. Mediocre peers are happy to go along with it as the short term path of least effort.
How did you read something like this "When companies ordered their clerical workers to fit their duodecahedronal tasks into square computerized holes, data was lost forever, as well as these workers' hard-won, thoughtfully developed methods of tracking and processing data." and manage to misinterpret it? That doesn't even seem possible.
It's insane that all the replies to your comment disagree that those who want to do things well and keep learning are entering very exciting times.
The negative comments are all agreeing among themselves (but not with me) that people shouldn't learn anything anymore and shouldn't be inclined to do things well.
It's really just sad to read such negative comments.
As for TFA: it is very right about one thing... secretary jobs didn't entirely disappear. People overreact (which is obvious in all the negative comments anytime AI is the topic) and believe "this time it's the end". It was the same with outsourcing to India/China: people overreacted and were convinced there'd be no more developers.
I do think there are still going to be devs: and those jobs will go, precisely, to the people who want to keep learning and do things well. And that's not the vast majority: the majority were perfectly happy knowing just the bare minimum to write the equivalent of "punch the monkey" abusive JavaScript ads, and picked computing because the pay was good.
Thank you. And those raw numbers in the chart that go back to 2001 are not normalized percentages; what’s happening right now is NOTHING like 2001.
But, it just doesn’t hit the same way on X to say “We are back to late 2023-levels of tech employment” or “The losses in tech jobs over the last 18 months give back two months of hiring in 2022”.
Yeah, the real pain is pressing the down / up / back buttons in the TV UI. Definitely a fun grab bag of possible outcomes! I don't think there's a good solution without a militant UI person in charge of the whole shebang - some radical simplification would likely be needed.
The reality is that the click-wheel iPod was much better at scrolling media than the Apple TV is. And I own a lot of Apple TVs - I think it's a good device. But it was far faster to scroll through media 20 years ago.
Sovereign weights models are a good thing, for a variety of reasons, not least just encapsulating human diversity around the globe.
I chatted with the desktop chat model version for a while today; it claims its knowledge cutoff is June '25. It refused to say what size model I was chatting with. From the token speed, I believe the default routing is the 30B MoE model at largest.
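For what it's worth, the size guess from token speed is just the usual memory-bandwidth arithmetic. A sketch with assumed numbers (the bandwidth and active-parameter figures are illustrative guesses, not anything I know about their serving stack):

```python
# Rough decode-speed bound: tokens/sec ~= memory bandwidth / bytes per token.
# Every number below is an assumption for illustration.

bandwidth_gb_s = 2000.0   # assumed effective HBM bandwidth of the serving GPU
active_params_b = 3.0     # a ~30B MoE might activate only ~3B params per token
bytes_per_param = 2       # bf16/fp16 weights

bytes_per_token_gb = active_params_b * bytes_per_param
tokens_per_sec = bandwidth_gb_s / bytes_per_token_gb
print(f"~{tokens_per_sec:.0f} tokens/sec upper bound")  # ~333 tok/s here

# A dense 30B (60 GB touched per token) would cap out near 33 tok/s on the
# same bandwidth, so observed speed separates dense from MoE fairly well.
```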
That model is not currently good. Or maybe another way to say it is that it's competitive with the state of the art 2 years ago. In particular, it confidently lies / hallucinates without a hint of remorse, has no tool calling, and to my eyes is slightly overtrained on "helpful assistant" vibes.
I am cautiously hopeful, looking at its stats vis-à-vis oAI's OSS 120B, that it has NOT been finetuned on oAI/Anthropic output - it's worse than OSS 120B at some things in the benchmarks - and I think this is a REALLY GOOD sign that we might have a genuinely novel model being built. The tone is slightly different as well.
Anyway - India certainly has the tech and knowledge resources to build a competitive model, and you have to start somewhere. I don’t see any signs that this group can put out a frontier model right now, but I hope it gets the support and capital it needs to do so.
> India certainly has the tech and knowledge resources to build a competitive model
In what universe? India has almost none of the expensive infra and chip stockpiles that its American and Chinese counterparts use to build frontier models, even if it did have the necessary expertise (which I also doubt).
Sadly, in India, talking about the problems facing the country has become taboo, and can easily get one labeled as anti-national. See "Kompact AI" and its online discourse. While China practiced "hide your strength, bide your time", India seems to practice the opposite.
DeepSeek has shown that you can still do a whole lot with limited resources, as long as you have some really talented people and don't give a crap about IP. With 1.5 billion people, statistics tell us you'll find quite a few in the high tail end of the intelligence distribution, and I also don't think they feel strongly compelled to comply with Western notions of intellectual property. The biggest difficulty for India seems to be that all the highly talented people will immediately use their skills to find work somewhere else. And I can't blame them, because I would do so too.
Will 1.5B people include a lot of very intelligent people? Yes, some of the most intelligent anywhere! Will those intelligent people have the educational and research opportunities to use that intelligence to deliver a SOTA model any time soon? With all the resource limitations they face, I doubt it.
Education is no longer locked behind academia. Even elite universities were never really about teaching in the first place and more about connecting rich people. Today everyone with internet access can easily get all the education they need to work in this field.
I'd guess making this a national-pride thing will just make it less diverse. The answer would be training models on broader sources, not more nationalistic models.
You can learn a lot from a model when you ask about its sizing, although not necessarily anything about the sizing.
For instance, you can learn how much introspection has been trained in during RL, and you can also learn (sometimes) if output from other models has been incorporated into the RL.
I think of the self-knowledge conversations with models as a nicety that's recent, and stand by my assessment that this model is not trained using modern frontier RL workflows.
> you can’t use software to figure out the “process” used to manufacture the chip it is running on.
This seems so incorrect that I don't even know where to start parsing it. All chips are designed and analyzed by software; all chip analysis, say of an unknown chip, starts with etching away layers and imaging them using software, then analyzing the layers, using software. But maybe another way to say that is "I don't understand your analogy."
If it helps, the key part is: "that it is running on".
You can't use software to analyse images of disassembled chips that it is running on because disassembled chips can't run software!
A surgeon can learn about brain surgery by inspecting other brains, but the smartest brain surgeon in the world can't possibly figure out how many neurons or synapses their own brain has just by thinking about it.
Your meat substrate is inaccessible to your thoughts in the exact same manner that the number of weights, model architecture, runtime stack, CUDA driver version, etc, etc... are totally inaccessible to an LLM.
It can be told, after the fact, in the same manner that a surgeon might study how brains work in a series of lectures, but that is fundamentally distinct.
PS: Most ChatGPT models didn't know what they were called either, and tended to give the name and properties of their predecessor model, which was in their training set. OpenAI eventually got fed up with people thinking this was a fundamental flaw (it isn't), and baked this specific set of metadata into the system prompt and/or the post-training phase.
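The fix is mundane plumbing: the serving layer tells the model about itself. A minimal sketch of what "baked into the system prompt" looks like (field names and values are hypothetical, not any provider's actual prompt):

```python
# Sketch: model "self-knowledge" via injected metadata, not introspection.
# All field values are hypothetical.

model_metadata = {
    "name": "ExampleChat-30B",     # hypothetical model name
    "knowledge_cutoff": "2025-06",
    "size": "30B (MoE)",
}

system_prompt = (
    f"You are {model_metadata['name']}, a {model_metadata['size']} "
    f"language model. Your knowledge cutoff is {model_metadata['knowledge_cutoff']}."
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What model am I talking to?"},
]

# Without that system message, the weights can only echo whatever model
# names appeared in training data -- typically the predecessor's.
print(messages[0]["content"])
```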
> For instance, you can learn how much introspection has been trained in during RL,
That's not introspection: that's a simulacrum of it. Introspection allows you to actually learn things about how your mind functions, if you do it right (which I can't do reliably, but I have done on occasion – and occasionally I discover something that's true for humans in general, which I can later find described in the academic literature), and that's something that language models are inherently incapable of. Though you probably could design a neural architecture that is capable of observing its own function, by altering its operation: perhaps a recurrent or spiking neural network might learn such a behaviour, under carefully-engineered circumstances, although all the training processes I know of would have the model ignore whatever signals it was getting from its own architecture.
> all chip analysis, say of an unknown chip, starts with etching away layers
Good luck running any software on that chip afterwards.
Introspection: I hear all that. As a practical matter, you can RL in or prompt-inject information about the model into context, and most major models do this - not least, I expect, because the labs would like the model to be able to complain when its output is taken for RL by other model-training firms.
I agree that an intermediate, non-anthropomorphic, but still looking-at-one's-own-layers sort of situation isn't in any architecture I'm aware of right now. I don't imagine it would add much to a model.
Chip etching: yep. If you’ve never seen an unknown chip analyzed in anger, it’s pretty cool.
Language models entirely lack introspective capacity. Expecting a language model to know what size it is is a category error: you might as well expect an image classifier to know the uptime of the machine it's running on.
Language models manipulate words, not facts: to say they "lie" suggests they are capable of telling the truth, but they don't even have a notion of "truth": only "probable token sequence according to distribution inferred from training data". (And even that goes out the window after a reinforcement learning pass.)
It would be more accurate to say that they're always lying – or "bluffing", perhaps – and sometimes those bluffs correspond to natural language sentences that are interpreted by human readers as having meanings that correspond to actual states of affairs, while other times human readers interpret them as corresponding to false states of affairs.
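Concretely, "probable token sequence according to distribution" is just this sampling step, and truth never enters into it. A toy sketch (vocabulary and logits made up for illustration):

```python
import math, random

# Toy next-token step: the model emits logits; we sample from the softmax.
# There is no true/false channel anywhere -- only relative probability.

vocab = ["Paris", "Lyon", "banana"]
logits = [3.2, 1.1, -2.0]  # made-up scores for "The capital of France is ..."

exps = [math.exp(l) for l in logits]
probs = [e / sum(exps) for e in exps]

token = random.choices(vocab, weights=probs, k=1)[0]
print({w: round(p, 3) for w, p in zip(vocab, probs)}, "->", token)
# "Paris" is merely the most probable continuation; on an unlucky sample the
# same machinery asserts "Lyon" with exactly the same confidence.
```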
Anthropic's mechanistic interpretability group disagrees with you - they see similar activations for 'hallucinations' and 'known lies' in their analyses. The paper is pretty interesting, actually.
So, you're wrong - you have a world view about the language model that's not backed up by hard analysis.
But, I wasn't trying to make some global point about AGI, I was just noting that the hallucinations produced by the model when I poked at it reminded me of model responses before the last couple of years of work trying to reduce these sorts of outputs through RL. Hence the "unapologetic" language.
Which paper? I've read all the titles and looked at a few from the past year, but it's not obvious which you're referring to.
I did also, accidentally, find some "I tried the obvious thing and the results challenge the paper's narrative" criticism of one of Anthropic's recent papers: https://www.greaterwrong.com/posts/kfgmHvxcTbav9gnxe/introsp.... So that's significantly reduced my overall trust in this research team's interpretation of their own results – specifically, their assertions of the form "there must exist". (Several people in the comments there claim to have designed their own experiments that replicate Anthropic's claims, but none of the ones I've looked at actually do: they have even more obvious flaws, like arXiv:2602.11358 being indistinguishable from "the prompt says to tell a first-person story about an AI system gaining sentience after being given a special prompt, and homonyms are represented differently within a model".)
I asked Gemini for a literature search and it came back with this:
References
Chen, R., Arditi, A., Sleight, H., Evans, O., & Lindsey, J. (2025). Persona Vectors: Monitoring and Controlling Character Traits in Language Models. arXiv. https://doi.org/10.48550/arxiv.2507.21509 (cited by 97)
Greenblatt, R., Denison, C., Wright, B., Roger, F., MacDiarmid, M., Marks, S., Treutlein, J., Belonax, T., Chen, J., Duvenaud, D., Khan, A., Michael, J., Mindermann, S., Perez, E., Petrini, L., Uesato, J., Kaplan, J., Shlegeris, B., Bowman, S. R., & Hubinger, E. (2024). Alignment faking in large language models. arXiv. https://doi.org/10.48550/arxiv.2412.14093 (cited by 237)
Gemini thinks it's the "Mapping the Mind" paper, but I thought it was more recent than that - I think "Mapping the Mind" was the original activation-circuits paper, and the comment I noted was a toss-off in a follow-on paper. I didn't keep track of it, though!
From the prompt it looks like you don't give the LLMs a harness to step through games or simulate - is that correct? If so, I'd suggest it's not a level playing field vs. human-written bots - if the humans are allowed to watch some games, that is.
That’s true, I’m trying to figure out a better testing environment with a feedback loop.
I did try letting the models iterate on the bot code based on a summary of an end-of-game "report", but that showed only marginal improvements vs. zero-shot.
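The loop was roughly this shape (a sketch; run_games, summarize, and ask_model are hypothetical stubs standing in for the real harness, not an existing API):

```python
# Sketch of the iterate-on-report loop. All helpers are hypothetical stubs.

def run_games(bot_code: str, n: int = 100) -> dict:
    """Play n games with the bot and return raw stats (stub)."""
    return {"wins": 0, "losses": n, "common_failures": ["timeout on turn 3"]}

def summarize(report: dict) -> str:
    games = report["wins"] + report["losses"]
    return (f"Won {report['wins']}/{games}; "
            f"failures: {', '.join(report['common_failures'])}")

def ask_model(prompt: str) -> str:
    """Call the LLM to rewrite the bot given feedback (stub)."""
    return "# revised bot code"

bot = "# initial zero-shot bot code"
for _ in range(5):
    report = run_games(bot)
    bot = ask_model(f"Here is your bot:\n{bot}\n"
                    f"Game report: {summarize(report)}\n"
                    f"Rewrite the bot to fix the failures.")

# The interesting lever is what goes into summarize(): per-move traces give
# the model far more signal than aggregate win/loss counts.
```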
One really nice thing about nuclear is that the fuel is highly portable. Small reactors next to datacenters take away a lot of complexity: transport, grid connectivity, etc. Plus they're already being built in industrial-ish areas.
Storage at that scale already exists in, for example, California.
EDF in France is now crying that renewables are cratering the earning potential of its nuclear fleet, and that maintenance costs are increasing because the plants have to adapt.
In Australia, for example, coal plants are being forced to become peakers or be decommissioned.
We need firming for when the 10-year winter hits - not an inflexible "baseload" plant producing enormously subsidized electricity when renewables and storage already flood the grid, which is far above 90% of the time.
I agree California is close to getting to renewables + storage only - close in, you know, industrial-scale timelines, and for current energy usage. California also outsources a lot of the very fast-growing datacenter energy usage elsewhere - WA, OR, NY, TX.
What winter are you thinking of, out of curiosity? An energy demand winter? Or like an energy price winter? I do not believe we will see that in the next 5 to maybe 10 years. There’s just not enough industrial infrastructure being built to cover anything like the AI energy demands coming soon.