Hacker News | scott_s's comments

You are correct, but this is not a new role. AI effectively makes all of us tech leads.


That's not what the author means. Multiple times a day, I have conversations with LLMs about specific code or general technologies. It is very similar to having the same conversation with a colleague. Yes, the LLM may be wrong. Which is why I'm constantly looking at the code myself to see if the explanation makes sense, or finding external docs to see if the concepts check out.

Importantly, the LLM is not writing code for me. It's explaining things, and I'm coming away with verifiable facts and conceptual frameworks I can apply to my work.


Yeah, it's a great way for me to reduce activation energy to get started on a specific topic. Certainly doesn't get me all the way home, but cracks it open enough to get started.


I kinda wonder to what extent grad students’ experience grading projects and homework will end up being a differentiating skill. 75% kidding.


I think of, and look up, this drunken rant at least once a year.


It doesn't. arXiv is exclusively a pre-print service. The ACM digital library is for peer-reviewed, published papers. All of the peer review happens through the ACM, as well as the physical conferences where people present and publish their papers.


The peer review is all done by volunteers of conferences, not ACM.


Yes, and that peer review happens through the ACM. It serves an organizing function. The conferences themselves are also in-person events, and most of the important research papers come out of those conferences.


IEEE may do it, as it's a professional organization. That is, they're a non-profit dedicated to the furtherance of the field. Being open access fits their mission, and the costs can be handled by dues and fees. Springer and Elsevier are for-profit publishers. I don't know if they can have an open-access business model.


Great news. They temporarily opened it in 2020 during the pandemic. I argued it should remain so in a post: https://www.scott-a-s.com/acm-digital-library-should-remain-.... I'm glad it's finally happened.


Gather metrics and regularly report them.


Agreed. In grad school, I used Perl to script running my benchmarks, post-process my data and generate pretty graphs for papers. It was all Perl 5 and gnuplot. Once I saw someone do the same thing with Python and matplotlib, I never looked back. I later actually started using Python professionally, as I believe lots of other people had similar epiphanies. And not just from Perl, but from different languages and domains.
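The workflow described above (script the benchmarks, post-process the data, plot the results) might look something like this in Python; the benchmark names and numbers here are invented for illustration, and the matplotlib step is guarded so the script still runs without it:

```python
# Sketch of a benchmark post-processing script: compute summary
# statistics per configuration, then plot with matplotlib if available.
# The data and filenames below are hypothetical.
import statistics

# Hypothetical benchmark results: runtime in seconds per trial.
runs = {
    "baseline": [12.1, 11.9, 12.3, 12.0],
    "optimized": [7.4, 7.6, 7.3, 7.5],
}

# Post-process: mean and standard deviation per configuration.
summary = {
    name: (statistics.mean(times), statistics.stdev(times))
    for name, times in runs.items()
}

for name, (mean, stdev) in summary.items():
    print(f"{name}: {mean:.2f}s +/- {stdev:.2f}s")

# Plotting is optional so the script runs even without matplotlib.
try:
    import matplotlib
    matplotlib.use("Agg")  # headless backend, suitable for scripts
    import matplotlib.pyplot as plt

    names = list(summary)
    means = [summary[n][0] for n in names]
    errs = [summary[n][1] for n in names]
    plt.bar(names, means, yerr=errs)
    plt.ylabel("runtime (s)")
    plt.savefig("benchmark.png")
except ImportError:
    pass  # stats were already printed; plotting is a bonus
```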

I think the article's author is implicitly not considering that people who were around when Perl was popular, who were perfectly capable of "understanding" it, actively decided against it.


That's true of all renewable energy sources. So we should take advantage of all of them, as much as is feasible.


> Of course, because I am not new to the problem, whereas an LLM is new to it every new prompt.

That is true for the LLMs you have access to now. Now imagine if the LLM had been trained on your entire code base. And not just the code, but the entire commit history, commit messages and also all of your external design docs. And code and docs from all relevant projects. That LLM would not be new to the problem every prompt. Basically, imagine that you fine-tuned an LLM for your specific project. You will eventually have access to such an LLM.


AI training doesn't work like that. You don't train it on context, you train it on recognition and patterns.


You train on data. Context is also data. If you want a model to have certain data, you can bake it into the model during training, or provide it as context during inference. But if the "context" you want the model to have is big enough, you're going to want to train (or fine-tune) on it.

Consider that you're coding a Linux device driver. If you ask for help from an LLM that has never seen the Linux kernel code, has never seen a Linux device driver and has never seen all of the documentation from the Linux kernel, you're going to need to provide all of this as context. And that's both going to be onerous on you, and it might not be feasible. But if the LLM has already seen all of that during training, you don't need to provide it as context. Your context may be as simple as "I am coding a Linux device driver" and show it some of your code.
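A back-of-the-envelope calculation makes the "might not be feasible" point concrete. Every number below is a rough assumption for illustration, not a measurement:

```python
# Rough feasibility check: could the entire Linux kernel tree fit in a
# context window? All figures are order-of-magnitude assumptions.
kernel_lines = 30_000_000    # assumed size of the kernel source tree
tokens_per_line = 10         # assumed average tokens per line of C
context_window = 1_000_000   # assumed large context window, in tokens

kernel_tokens = kernel_lines * tokens_per_line
print(f"kernel: ~{kernel_tokens:,} tokens")
print(f"window: {context_window:,} tokens")
print(f"ratio: ~{kernel_tokens // context_window}x too large")
```

Under these assumptions the source alone is hundreds of times larger than the window, before counting documentation or commit history, which is the case for baking it in during training instead.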


Why haven’t the big AI companies been pursuing that approach, vs just ramping up context window size?


Well, we don't really know if they aren't doing exactly that for their internal code repos, right?

Conceptually, there is no difference between fine-tuning an LLM to be an expert on a specific country's laws and fine-tuning an LLM to be an expert on a given codebase. The former is already happening and is public. The latter is not yet public, but I believe it is happening.

The reason big companies are pursuing generic LLMs is that they serve as a foundation for basically any other derivative and domain-specific work.


Because training one family of models with very large context windows can be offered to the entire world as an online service. That is a very different business model from training or fine-tuning individual models specifically for individual customers. Someone will figure out how to do that at scale, eventually. It might require the cost of training to drop significantly. But large companies with the resources to do this for themselves will do it, and many are doing it.

