That's true, and I guess the reason we're building so many datacenters is to answer the question of exactly how far blind speed will take us, assuming we fail to make substantial improvements to AI architecture.
What's the point? You can make good or bad software, with or without LLMs. Do you ask a carpenter if they use a hammer or nail gun? Did they only use the nail gun for the roof and the deck?
If you care that much and don't have a foundation of trust, you need to either verify the construction is good, or build it yourself. Anything else is just wishful thinking.
Reporting back: this appears to be a bug in my original test, the code of which, sadly, I did not commit anywhere. I went back to regenerate these tests and proved the opposite: the C API is better than PRAGMA and works across connections. I'm going to make that update, as I've now verified across dozens of SQLite versions that my original claim does not hold.
No real person wants that. A bunch of hackers want it, so that they can try it a couple of times as a fun side project and then never use it again in their life.
First of all, this is more than just note taking. It appears to be a (yet another) harness for coordinating work between agents with minimal human intervention. And as such, shouldn’t part of the point be to not have to build that mental model yourself, but rather offload it to the shared LLM “brain”?
Highly debatable whether it’s possible to create anything truly valuable (valuable for the owner of the product that is) with this approach, though. I’m not convinced that it will ever be possible to create valuable products from just a prompt and an agent harness. At that point, the product itself can be (re)created by anyone, product development has been commodified, and the only thing of value is tokens.
My hypothesis is that “do things that don’t scale”[0] will still apply well into the future, but the “things that don’t scale” will change.
All that said, I've finally started using Obsidian after setting up some skills for note taking, researching, linking, splitting, and restructuring the knowledge base. I've never been able to spend time on keeping it structured, but I now have a digital secretary that can do all of the work I'm too lazy to do. I can just jot down random thoughts and ideas, and the agent structures them, asks follow-up questions, relates them to other ongoing work, and so on. I'm still putting in the work of reading sources and building a mental model, but I'm also getting high-quality notes almost for free.
/*
* RealTek 8129/8139 PCI NIC driver
*
* Supports several extremely cheap PCI 10/100 adapters based on
* the RealTek chipset. Datasheets can be obtained from
* www.realtek.com.tw.
*
* Written by Bill Paul <wpaul@ctr.columbia.edu>
* Electrical Engineering Department
* Columbia University, New York City
 */

/*
* The RealTek 8139 PCI NIC redefines the meaning of 'low end.' This is
* probably the worst PCI ethernet controller ever made, with the possible
* exception of the FEAST chip made by SMC. The 8139 supports bus-master
* DMA, but it has a terrible interface that nullifies any performance
* gains that bus-master DMA usually offers.
*
* For transmission, the chip offers a series of four TX descriptor
* registers. Each transmit frame must be in a contiguous buffer, aligned
* on a longword (32-bit) boundary. This means we almost always have to
* do mbuf copies in order to transmit a frame, except in the unlikely
* case where a) the packet fits into a single mbuf, and b) the packet
* is 32-bit aligned within the mbuf's data area. The presence of only
* four descriptor registers means that we can never have more than four
* packets queued for transmission at any one time.
*
* Reception is not much better. The driver has to allocate a single large
* buffer area (up to 64K in size) into which the chip will DMA received
* frames. Because we don't know where within this region received packets
* will begin or end, we have no choice but to copy data from the buffer
* area into mbufs in order to pass the packets up to the higher protocol
* levels.
*
* It's impossible given this rotten design to really achieve decent
 * performance at 100Mbps, unless you happen to have a 400MHz PII or
* some equally overmuscled CPU to drive it.
*
* On the bright side, the 8139 does have a built-in PHY, although
* rather than using an MDIO serial interface like most other NICs, the
* PHY registers are directly accessible through the 8139's register
* space. The 8139 supports autonegotiation, as well as a 64-bit multicast
* filter.
*
* The 8129 chip is an older version of the 8139 that uses an external PHY
* chip. The 8129 has a serial MDIO interface for accessing the MII where
* the 8139 lets you directly access the on-board PHY registers. We need
* to select which interface to use depending on the chip type.
*/
As someone who has been using Cyrillic writing all my life, I've never noticed this bloat you're speaking of, honestly...
Maybe if you're one of those AI behemoths working with exabytes of training data, a more compact encoding would make some sense, though the savings would still be less than 50% (since we use lots of Latin terms, acronyms, and punctuation marks, all of which already fit in one byte in UTF-8).
On the web and in other kinds of daily text processing, one poorly compressed image or one JavaScript-heavy webshite obliterates all "savings" you would have had in that week by encoding text in something more efficient.
It's the same with databases. I've never seen anyone pick anything other than UTF-8 in the last 10 years at least, even though 99% of what we store there is Cyrillic. I sometimes run into old databases, usually Oracle, that were set up in the 90s and never really upgraded. The data is in some weird encoding nobody has heard of in decades, and it's always a pain to integrate with them.
I remember the days of codepages. Seeing broken text was the norm. Technically advanced users would quickly learn to guess the correct text encoding by the shapes of glyphs we would see when opening a file. Do not want.
The space of self-building artefacts is interesting, and it's booming now because recent LLM versions are quickly getting good at it (particularly the "coding" models).
I've also experimented recently with such a project [0] with minimal dependencies and with some emphasis on staying local and in control of the agent.
It builds and organises its own SQLite database to fulfil a long-running task given in a prompt, while having access to a local Wikipedia copy for source data.
It's a very minimal harness and tool set for experimenting with agent drift.
Adding an image processing tool to this framework is also easy: encode the images as base64 (the details can be vibecoded by local LLMs) and pass them to llama.cpp.
It's a useful versatile tool to have.
For example, I used to have some scripts that processed invoices and receipts in certain folders, extracting the amount, date, and vendor using Amazon Textract; then I had a UI to manually check the numbers and put the result in a CSV for the accountant every year. Now I can replace the Amazon Textract requests with a llama.cpp model call and an appropriate prompt while still using my existing invoice tools, and with a prompt I can do a lot more creative accounting.
I have also experimented with a vibecoded variation of this code to drive a physical robot from a sequence of camera images, and while it does move and reach the target in simple cases (even though the LLM I use was never explicitly trained to drive a robot), it is too slow (10 s to choose the next action) for practical use. (The current non-deep-learning controller I use for this robot runs its vision processing loop at 20 Hz.)
But that's also basically true for humans. It's harder to "prove" humans are random, but wouldn't you think a person would do things slightly differently when given the same tasks but on different days? People change their minds a lot, it's just that there's no "reconsider" button for people so you feel a bit of social friction if you pester somebody to rethink an issue. But it's no different.
I'd be really surprised if your point is that humans, unlike AI, are super deterministic and that's why they are so much more trustworthy and smarter than AI...
This seems like wild speculation that isn't even really true at all. OpenAI locked up all the compute capacity, which is why Anthropic is struggling so badly to scale capacity for demand. It's why Claude quality is plummeting and people are leaving in droves: the usage limits are pathetic and the API pricing structure is outrageous. All because they can't scale. So that's what this deal is about.
Are you referring to the one (1) study showing that when cheaper LLMs auto-generated an AGENTS.md, they performed more poorly than with a human-edited AGENTS.md? https://arxiv.org/abs/2602.11988
I'd love to see other sources that seek to academically understand how LLMs use context, specifically ones using modern frontier models.
My takeaway from these CLAUDE.md/AGENTS.md efforts isn't that agents can't maintain any form of context at all, rather, that bloated CLAUDE.md files filled with data that agents can gather on the spot very quickly are counter-productive.
For information which cannot be gathered on the spot quickly, clearly (to me) context helps improve quality, and in my experience, having AI summarize some key information in a thread and write to a file, and organize that, has been helpful and useful.
This money could be invested in universal healthcare, or into AI research for medicine. But hey, I guess replacing developers and generating slop is more beneficial to our society.
That’s because you’ve been there a decade. It’s very common for people to skip jobs every 2 years so that they never end up seeing the long term consequences of their actions.
The other common pattern I’ve seen goes something like this.
Product asks Tactical Tornado if they can build something. TT says sure, it will take 6 weeks. TT doesn't push back or ask questions; he builds exactly what product asks for in an enormous feature branch.
At the end of 6 weeks he tries to merge it and he gets pushback from one or more of the maintainability people.
Then he tells management that he’s being blocked. The feature is already done and it works. Also the concerns other engineers have can’t be addressed because “those are product requirements”. He’ll revisit it later to improve on it. He never does because he’s onto the next feature.
Here’s the thing. A good engineer would have worked with product to tweak the feature up front so that it’s maintainable, performant etc…
This guy uses product requirements (many that aren’t actually requirements) and deadlines to shove his slop through.
At some companies management will catch on and he’ll get pushed out. At other companies he’ll be praised as a high performer for years.
> I doubt many people will honestly admit they did no design, testing and that they believe the code is sub par
I'd doubt any engineer who doesn't call most of their own code subpar when looking back after a week or two. "Hacking" also famously involves little design or (automated) testing, so sharing something like that doesn't mean much, unless you're trying to launch a business, and I see no evidence of that for this project.
> motherload of all heavy industry vendor financing
I doubt they are bigger than other national "heavy industry" champions from East Asia and Western/Central Europe. Without checking, I would guess that the global leaders are Boeing and Airbus.
>There is a federal law that prohibits people from communicating with dolphins.
>It’s called the Marine Mammal Protection Act. Signed in 1972 by President Richard Nixon, the federal law was created to protect marine mammals from being hunted, harassed, captured or killed.
>In a sense, talking to or communicating with dolphins could qualify as harassment under the Marine Mammal Protection Act.
>There are two levels of harassment, according to the National Oceanic and Atmospheric Administration. Harassment at one level is considered “any act of pursuit, torment, or annoyance that has the potential to injure a marine mammal or marine mammal stock in the wild.”
>On another level, harassment is defined by the NOAA as “acts having the potential to disturb (but not injure) a marine mammal or marine mammal stock in the wild by disrupting behavioral patterns, including, but not limited to, migration, breathing, nursing, breeding, feeding, or sheltering.”
Yeah, as someone who had to implement a protocol stack to talk to an X.400 server, it was not fun at all. Arcane encodings, a monster spec, and all sorts of server-specific quirks that you had to get exactly right if you wanted the server to accept your email.
Compared to that, when I implemented RFC821/822 (i.e. SMTP) mail, the hardest part was the weird line-encodings, but other than that, the spec was ___so___ nicely readable and pragmatic.