Hacker News | tomlis's comments

I really don't think that's true. Think Uber vs. Lyft. I know I distinguish between the two even if the experience is usually about the same and people I know where this has come up in conversation generally see Lyft as "off-brand" and a little more skeevy. They only take Lyfts when it's cheaper or quicker than Uber.

I'm probably not the average consumer in this situation but I was in Austin recently and took both Waymo and Robotaxi. I significantly preferred the Waymo experience. It felt far more integrated and... complete? It also felt very safe (it avoided getting into an accident in a circumstance where I certainly would have crashed).

I hope Tesla gets their act together so that the autonomous taxi market can engage in real price discovery instead of "same price as an Uber but you don't have to tip." Surely it's lower than that especially as more and more of these vehicles get onto the road.

Unrelated to driving ability but related to the brand discussion: that graffiti font Tesla uses for Cybertruck and Robotaxi is SO ugly and cringey. That alone gives me a slight aversion.


gpt-5.3-codex isn't available via the API yet. Pretty sure they were only testing via API access.


As a rule of thumb, most people who say things like "X is useless and a waste" or "Y is revolutionary and is going to change everything by tomorrow" when the dust hasn't even begun to settle are stupid, overly-excitable, too biased towards negative outlooks, and/or trying to sell you something.

Sometimes they have some good points so you should listen to what they have to say. But that doesn't mean you have to get absorbed into their world view. Just integrate what you see as useful from your current POV and move on.


Well, at some point it's up to us to say 'no.' Weekends have not always been a widely accepted ritual[1]. They only became one because of collective action.

Dedicating any and all of your free time to work only becomes a norm if we let it.

[1]: https://www.goodwinrecruiting.com/eight-hour-workdays-and-40...


Well, they did specify your _newly_ freed time. So if you work 8 hours now and AI lets you do that work in 4, then you'll just do double the work in 8 hours, not get more free time.

I don't think there's an obvious point at which to take collective action.


You mean it's up to unions.


The deterministic-mixed-with-LLM approach has been great for me so far. I've been getting a lot of the gains the "do it all with AI" people have been preaching, but with far fewer pitfalls. It's sometimes not as fluid as what you see with the full-LLM-agent setups, but that's perfectly acceptable to me, and I handle those issues on a case-by-case basis.


I'd argue that the moment one cares about accuracy and blast radius, one naturally wants to reduce the error compounding that comes from chaining LLM calls (non-deterministic), and it's very natural to defer to well-tested deterministic tools.

"Do one thing and do it well" building blocks, with the LLM acting as a translation layer with reasoning and routing capabilities. It doesn't matter if it's one agent or an orchestrated swarm of them.
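To make the idea concrete, here's a minimal sketch of that architecture. Everything here is hypothetical: `call_llm` stands in for whatever model API you'd actually use (it's stubbed so the routing logic runs on its own), and the tools are toy examples of deterministic building blocks.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call. In practice the model
    # would be prompted to return the name of the tool to invoke; here
    # it's stubbed deterministically so the example is self-contained.
    if "average" in prompt.lower():
        return "mean"
    return "total"

# Deterministic, well-tested, do-one-thing building blocks.
TOOLS = {
    "total": lambda xs: sum(xs),
    "mean": lambda xs: sum(xs) / len(xs),
}

def handle(request: str, data: list[float]) -> float:
    # The LLM only decides *which* tool runs (routing/translation);
    # the actual computation is deterministic, so errors don't compound.
    tool_name = call_llm(request)
    return TOOLS[tool_name](data)
```

The point is that the non-deterministic part touches only the routing decision, while the blast radius of any mistake is bounded by what the chosen tool can do.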

https://alexhans.github.io/posts/series/evals/error-compound...


Yeah. One of the patterns I've fallen into looks a bit like this:

1. I have some new task I need/want to do.

2. For whatever reason, it's not something I want to do myself if I can avoid it.

3. Have the agent do it the first few times.

4. After those first few iterations, think about if it's something where the variability in the number of steps needed to complete the task is small enough to just put into a small script or service. If it is, either write the code myself or ask the agent to create draft code based on its own observations of how it did the task those first few times. If it's not, just keep having the agent do it.

5. A good chunk of the time, most of the task has low variability in what it needs to do except for just one portion. In that case, just use deterministic code for all areas of the program except the high variability area.

There's probably a better word than "variability" for what I'm talking about, but I think you get the idea. Spend a lot of tokens upfront so that the tokens used later can be minimized when possible.
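As a rough illustration of step 5, here's a hypothetical sketch: a pipeline where everything is deterministic except one fuzzy step. The names (`classify_with_llm`, `process_ticket`) and the ticket format are made up for the example, and the model call is stubbed so it runs as-is.

```python
def classify_with_llm(description: str) -> str:
    # Stand-in for the one high-variability step; real code would
    # prompt a model here. Stubbed with a keyword check for the demo.
    return "bug" if "crash" in description.lower() else "feature"

def process_ticket(raw: str) -> dict:
    # Deterministic parsing of a fixed "id|title|description" format.
    ticket_id, title, description = raw.split("|", 2)
    return {
        "id": int(ticket_id),        # deterministic
        "title": title.strip(),      # deterministic
        # The only place the LLM gets involved:
        "category": classify_with_llm(description),
    }
```

Only the classification step spends tokens; the parsing never drifts between runs, which is exactly the "deterministic code everywhere except the high-variability area" split.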

EDIT: Formatting.


Yeah, the idea is clear. You're "integrating early" and "failing fast" and once you've understood enough about the problem you can design and optimize the right custom tool to make it more accurate, consistent, cost-effective.

To be fair, it's a micro version of the rapid way to approach projects: instead of trying to design too much upfront, identify the real value-producing goals and the foreseeable risks in the middle, then get hands-on in a time-boxed manner to de-risk the individual points or learn what's not possible. Then you can actually come up with the right explanations for the design.

