It's crazy to me that this is not considered fraud. You sign up for a yearly plan under a given assumption of functionality, then they just change the terms to give you less than what they agreed to without compensating you in any way. That's textbook fraud.
I wonder whether this was your first attempt to solve this issue with LLMs, and this was the time you finally felt they were good enough for the job. Did you try doing this switch earlier on, for example last year when Claude Code was released?
Honestly, I was very averse to agentic coding up until Opus came out. The hallucinations and false confidence in objectively wrong answers just broke more things than they fixed.
However, once it came out, it suddenly behaved close to how it was marketed. So this was my first real end-to-end project with AI in the front seat. Design-wise it's nowhere near perfect, though; I was holding its hand the entire way through.
Fascinating stuff. Any chance of using a sparse autoencoder or some other method to try to grasp what the model is actually doing in those middle layers? It would be quite cool to get a better sense of what type of input it receives the first time it goes through the reasoning circuit compared to the second or third time.
How do you know that? We don't have access to the logs to know anything about its training, and it's impossible for it to have trained on every potential position in Go.
As another commenter mentioned, the point of the story from Borges is that a perfectly detailed map is rather useless, because you need abstraction (it's a repeating theme in some other stories from him like the Library of Babel, and Funes the Memorious). LLMs are likely already able to exhaust the conceptual space for any given field, but some judgement is still going to be required about what to pursue. In biology and other fields this problem is even bigger because experimentation is so difficult and expensive.
The process of judgement and resource allocation will still be human for quite a while, but it's quite likely some humans will outsource their responsibility to AI to cut corners.
The main thrust of the article is that codebases can grow too large to be manageable by LLMs.
> It simply will not fit the context window, and README files are of limited use.
I think many useful applications can be built without reaching current context window limits, which will certainly grow. Besides, there are many tricks that Claude Code and Codex use for getting around this problem, such as compacting and sharding a task across many agents.
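The "compacting" trick mentioned above can be sketched roughly: once a transcript exceeds a token budget, older messages get collapsed into a summary so the conversation still fits in the context window. This is an illustrative sketch only, not Claude Code's or Codex's actual implementation; the function names and the word-based token estimate are made up.

```python
# Sketch of context compaction for an agent transcript (hypothetical names;
# a real agent would ask the model itself to write the summary).

def estimate_tokens(text: str) -> int:
    # Crude heuristic: assume roughly 0.75 words per token.
    return int(len(text.split()) / 0.75)

def compact(messages: list[str], budget: int) -> list[str]:
    total = sum(estimate_tokens(m) for m in messages)
    if total <= budget:
        return messages  # everything still fits; nothing to do
    # Keep the most recent messages verbatim, up to half the budget,
    # and fold everything older into a single summary placeholder.
    kept, used = [], 0
    for m in reversed(messages):
        cost = estimate_tokens(m)
        if used + cost > budget // 2:
            break
        kept.append(m)
        used += cost
    dropped = len(messages) - len(kept)
    summary = f"[summary of {dropped} earlier messages]"
    return [summary] + list(reversed(kept))
```

Sharding is the complementary trick: instead of shrinking history, the task itself is split so each sub-agent only ever needs a fraction of the full context.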