mtw14's comments | Hacker News

What was the gap you discovered that made it not shippable? This is an experimental project, so I'm curious to know what sorts of problems you ran into when you tried a similar approach.


Three things:

1. Confirmable, predictable behavior (can we test it? Can we make assurances to customers?).

2. Comparative performance (an LLM call extracting from a list takes hundreds of ms, versus code doing the same in under 10 ms).

3. Operating costs. LLM calls are spendy. Think of them as hyper-unoptimized lossy function executors (as well as lossy encyclopedias), and for some small problems the execution cost starts to approach bogosort levels.
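The latency and cost gap in points 2 and 3 is easy to see with a concrete case. This sketch (illustrative only; no code appears in the thread) shows the deterministic alternative to an LLM extraction call:

```python
import re

# Illustrative data: rows an LLM might otherwise be asked to extract from.
ROWS = [
    "Ada Lovelace <ada@example.com>",
    "no contact info here",
    "Bob <bob@example.com>",
]

EMAIL = re.compile(r"[\w.+-]+@[\w.-]+\.\w+")

def extract_emails(rows):
    """Pull the first email address out of each row, skipping rows
    without one. Runs in microseconds and is deterministic; an
    equivalent LLM call takes hundreds of ms, costs per token, and
    can return lossy output."""
    hits = (EMAIL.search(row) for row in rows)
    return [m.group(0) for m in hits if m]

print(extract_emails(ROWS))  # ['ada@example.com', 'bob@example.com']
```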

Buuuuuut... I had working, functional prototype explorations with almost no effort on my end, within an hour.

We've now extended this thinking to some experience exploration builders, so it definitely has a place in the toolbox.


I'm wondering whether the post-condition checks change the perspective on this at all. Yes, the code is nondeterministic and may execute differently each time; that is exactly the problem this is trying to solve. You define validation rules that act as deterministic post-condition checks, and the system retries until validation passes (up to a maximum retry count). So even if the model changes, and its behavior changes with it, the post-condition checks should in theory catch that drift and correct the behavior until the output meets the requirements.
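A minimal sketch of that retry loop, assuming a generic callable for the model; the function names and the stubbed "model" below are mine, not from the project:

```python
import json

def run_with_postconditions(task, validators, max_retries=3):
    """Call a nondeterministic `task` until every deterministic
    post-condition check passes, or fail after `max_retries` attempts."""
    result = None
    for _ in range(max_retries):
        result = task()
        if all(check(result) for check in validators):
            return result
    raise ValueError(f"failed validation after {max_retries} tries: {result!r}")

# Simulated nondeterministic model call: garbage first, valid JSON second.
calls = {"n": 0}
def flaky_model():
    calls["n"] += 1
    return "oops, not JSON" if calls["n"] == 1 else '{"name": "Ada"}'

def is_json_object(s):
    """Deterministic post-condition: output must parse as a JSON object."""
    try:
        return isinstance(json.loads(s), dict)
    except ValueError:
        return False

result = run_with_postconditions(flaky_model, [is_json_object])
print(result)  # '{"name": "Ada"}', reached after one retry
```

The checks themselves stay ordinary deterministic code, which is what makes the drift-catching argument work: the nondeterminism is confined to `task`.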

