More

piinbinary · 2026-04-02T13:15:43 1775135743

I'd be very curious to know what class of vulnerability these tend to be (buffer overrun, use after free, misset execute permissions?), and if, armed with that knowledge, a deterministic tool could reliably find or prevent all such vulnerabilities. Can linters find these? Perhaps fuzzing? If code was written in a more modern language, is it sill likely that these bugs would have happened?

bri3d · 2026-04-02T16:28:50 1775147330

> Can linters find these? Perhaps fuzzing?

That's what syzbot / syzkaller does, as mentioned in the article, with somewhat similar results to the AI-fuzzing that they've been experiencing recently.

The issue that Linux maintainers have in general is that there are so many of these "strict correctness and safety" bugs in the Linux codebase that they can't fix them all at once, and they have no good mechanism to triage "which of these bugs is accessible to create an exploit."

This is also the argument by which most of their bugs become CVEs; in lieu of the capability to determine whether a correctness bug is reachable by an attacker, any bug could be an exploit, and their stance is that it's too much work to decide which is which.

tptacek · 2026-04-02T17:25:20 1775150720

It's a bigger deal than that.

Academically, syzkaller is just a very well orchestrated fuzzer, producing random pathological inputs to system calls, detecting crashes, and then producing reproductions. Syzkaller doesn't "know" what it's found, and a substantial fraction of what it finds are "just" crashers that won't ever be weaponizable.

An LLM agent finding vulnerabilities is an implicit search process over a corpus of inferred vulnerability patterns and inferred program structure. It's stochastic static program analysis (until you have the agent start testing). It's generating (and potentially verifying) hypotheses about actual vulnerabilities in the code.

That distinction is mostly academic. The bigger deal is: syzkaller crashes are part of the corpora of inputs agents will use to verify hypotheses about how to exploit Linux. It's an open secret that there are significant vulnerabilities encoded in the (mostly public!) corpus of syzbot crash reproductions; nobody has time to fish them out. But agents do, and have the added advantage of being able to quickly place a crash reproduction in the inferred context of kernel internals.

bri3d · 2026-04-02T17:46:29 1775151989

Yes, once we reach the broader conversation (I actually didn't initially grasp that the OP post was a sub-article under another one on LWN which then linked out to yet another article called "Vulnerability Research is Cooked"), I completely agree.

Modern LLMs are _exceptionally_ good at developing X-marks-the-spot vulnerabilities into working software; I fed an old RSA validation mistake in an ECU to someone in a GitHub comment the other day and they had Claude build them a working firmware reflashing tool within a matter of hours.

I think that the market for "using LLMs to triage bug-report inputs by asking it to produce working PoCs" is incredibly under-leveraged so far and if I were more entrepreneurial-minded at this junction I would even consider a company in this space. I'm a little surprised that both this article and most of the discussion under it hasn't gone that direction yet.

tptacek · 2026-04-02T17:54:38 1775152478

(I wrote the "Cooked" article, I'm not entirely sure why people are commenting on it on LWN.)

Chu4eeno · 2026-04-07T18:19:26 1775585966

according to anthropic's red team not even the secret claude stuff they're holding back is able to weaponize vulnerabilities without simplifying (disabling mitigations etc).

so we might be lucky that the LLMs are able to find the vulnerabilities before they are able to weaponize them, giving defense a time window.

piinbinary · 2026-03-19T19:51:15 1773949875

How does the benchmark tell how many branches were mispredicted? Is that something the processor exposes?

ErikCorry · 2026-03-20T00:35:02 1773966902

Yeah performance counters

cloudbonsai · 2026-03-20T07:00:34 1773990034

To fill in the details, here is the code used for the measurement:

https://github.com/lemire/counters/blob/main/include/counter...

It fetches the number of mispredicted instructions from Linux's perf subsystem, which in turn gathers the metrics from CPU's PMU (Performance Monitoring Unit) interface.

piinbinary · 2026-03-19T01:37:33 1773884253

> a luxury apartment building goes up, surveys the market, and sets its rents 30% higher for the privilege of living in a new building with a gym for dogs or ball pit or whatever. Then the older buildings say, "Well, we can raise our rents 20% and still be the best deal in town," and so on.

I think that might not be the right cause and effect relationship. The actual cause is increased demand. This creates both the increased pricing of existing stock and an incentive to build new stock.

piinbinary · 2026-02-25T17:11:54 1772039514

Back when I lived in SF, there was one bus route (the 6, I believe) that I could use to get to work. The bus was so slow due to frequent, long stops and traffic lights that I could keep up with it on foot by walking briskly. I only bothered taking it when it was raining because it didn't get me to work any faster than walking.

piinbinary · 2026-02-25T17:07:47 1772039267

That's an interesting point.

I'm also curious how bus stops interact with timed lights. Presumably each time the bus stops, it gets kicked back to the next cycle of green lights (which might be a low-single-digit minute delay).

Hopefully there's a traffic engineer in the audience who can give the real answers.

johannes1234321 · 2026-02-25T17:23:29 1772040209

The way it is done her ein my European city is that the bus stop is move behind the traffic lights. The bus and the system are in radio contact, thus the position is known. The time the bus needs from current location to the traffic lights on green light can be predicted, thus the system can calculate whether to keep the green light till the bus arrives or turn red, let the crossing traffic go and then turn green for the bus again. The less predictable time of passenger getting off and on (takes time when crowded, wheelchair takes time, but can be fast when nobody requires that stop) is behind the traffic lights, thus doesn't have to go into the calculation.

Of course this has limits on density of traffic lights and traffic isn't fully predictable either, but overall this works quite well, giving busses mostly a green wave.

piinbinary · 2026-02-24T16:15:20 1771949720

I think it's missing Google's Bard

piinbinary · 2026-02-19T19:00:31 1771527631

That reminds me of `curl wttr.in/94110`

owenmarshall · 2026-02-19T19:25:15 1771529115

I also enjoy `finger <cityname>@graph.no`

piinbinary · 2026-02-19T03:26:14 1771471574

> friends don’t just bring up type inference in casual conversation

I wonder if this is a reference to "I need you to understand that people don't have conversations where they randomly recommend operating systems to one another"

But to the actual point of the article: my understanding is that there are areas where you can use bidirectional typing (e.g. languages that have subclasses) where HM style type inference might become undecidable.

laksjhdlka · 2026-02-19T03:33:11 1771471991

I once studied proof theory for a summer at a school in Paris and we talked about type inference and theorem proving all the time in casual conversation, over beers, in the park. It was glorious.

Being a student is so much fun, and we often waste it, or at least don't value it as much as we ought. 20 years later I'd love to go back.

magicalhippo · 2026-02-19T04:47:52 1771476472

> Being a student is so much fun, and we often waste it, or at least don't value it as much as we ought. 20 years later I'd love to go back.

An aside, but some years ago I watched the demo 1995 by Kewlers and mfx[1][2] for the first time and had a visceral reaction precisely due to that, thinking back to my teen years tinkering on my dad's computer, trying to figure out 3D rendering and other effects inspired by demos like Second Reality[3] or Dope[4].

I seldom become emotional but that 1995 demo really brought me back. It was a struggle, but the hours of carefree work brought the joy of figuring things out and getting it to work.

These days it's seldom I can immerse myself for hours upon hours in some pet project. So I just look things up on the internet. It just doesn't feel the same...

[1]: https://www.youtube.com/watch?v=weGYilwd1YI

[2]: https://www.pouet.net/prod.php?which=25783

[3]: https://www.pouet.net/prod.php?which=63

[4]: https://www.pouet.net/prod.php?which=37

mietek · 2026-02-19T06:59:48 1771484388

Join us in ##dependent on Libera IRC. We continue to talk about this stuff all the time, with a focus on Martin-Löf intuitionistic type theory.

thunderseethe · 2026-02-19T03:46:43 1771472803

> I wonder if this is a reference to "I need you to understand that people don't have conversations where they randomly recommend operating systems to one another"

It is!

> my understanding is that there are areas where you can use bidirectional typing (e.g. languages that have subclasses) where HM style type inference might become undecidable

There are! Afaik most languages end up with a bidirectional system in practice for this reason. Haskell started out HM and has shifted to bidir because it interacts better with impredicative types (and visible type applications). Bidir can handle fancy features like subtyping and all sorts of nifty stuff.

bboreham · 2026-02-19T07:43:58 1771487038

The subject does sometimes come up in my casual conversations, since Robin Milner was my first CS lecturer.

He never actually spoke about type inference in my presence. He did teach me CCS (pi-calculus predecessor) a couple of years later, by which time I could appreciate him.

piinbinary · 2026-02-14T16:53:54 1771088034

I'm impressed with how approachable the explanation is!

piinbinary · 2026-02-09T12:53:53 1770641633

That stuck out to me as well.

I wonder if there could be a bug where extra code runs but the result is discarded (and the code that runs happens to have no side effects).

The post also says

> That is roughly 1 billion iterations

but that doesn't sound right because GCC's version runs in only 0.047s, and no CPU can do a billion iterations that quickly.