Hacker News | scott_s's comments

You are correct, but this is not a new role. AI effectively makes all of us tech leads.


That's not what the author means. Multiple times a day, I have conversations with LLMs about specific code or general technologies. It is very similar to having the same conversation with a colleague. Yes, the LLM may be wrong. Which is why I'm constantly looking at the code myself to see if the explanation makes sense, or finding external docs to see if the concepts check out.

Importantly, the LLM is not writing code for me. It's explaining things, and I'm coming away with verifiable facts and conceptual frameworks I can apply to my work.


Yeah, it's a great way for me to reduce activation energy to get started on a specific topic. Certainly doesn't get me all the way home, but cracks it open enough to get started.


I kinda wonder to what extent grad students’ experience grading projects and homework will end up being a differentiating skill. 75% kidding.


I think of, and look up, this drunken rant at least once a year.


It doesn't. arXiv is exclusively a pre-print service. The ACM digital library is for peer-reviewed, published papers. All of the peer review happens through the ACM, as well as the physical conferences where people present and publish their papers.


The peer review is all done by volunteers of conferences, not ACM.


Yes, and that peer review happens through the ACM. It serves an organizing function. The conferences themselves are also in-person events, and most of the important research papers come out of those conferences.


IEEE may do it, as it's a professional organization. That is, they're a non-profit dedicated to the furtherance of the field. Being open access fits their mission, and the costs can be handled by dues and fees. Springer and Elsevier are for-profit publishers. I don't know if they can have an open-access business model.


Great news. They temporarily opened it in 2020 during the pandemic. I argued it should remain so in a post: https://www.scott-a-s.com/acm-digital-library-should-remain-.... I'm glad it's finally happened.


Gather metrics and regularly report them.


Agreed. In grad school, I used Perl to script running my benchmarks, post-process my data and generate pretty graphs for papers. It was all Perl 5 and gnuplot. Once I saw someone do the same thing with Python and matplotlib, I never looked back. I later actually started using Python professionally, as I believe lots of other people had similar epiphanies. And not just from Perl, but from different languages and domains.
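The workflow described above (script the benchmarks, post-process the data, plot the results) might look something like this in Python; the benchmark names and numbers here are invented for illustration, and the matplotlib step is guarded so the script still runs without it:

```python
# Sketch of a benchmark post-processing script: compute summary
# statistics per configuration, then plot with matplotlib if available.
# The data and filenames below are hypothetical.
import statistics

# Hypothetical benchmark results: runtime in seconds per trial.
runs = {
    "baseline": [12.1, 11.9, 12.3, 12.0],
    "optimized": [7.4, 7.6, 7.3, 7.5],
}

# Post-process: mean and standard deviation per configuration.
summary = {
    name: (statistics.mean(times), statistics.stdev(times))
    for name, times in runs.items()
}

for name, (mean, stdev) in summary.items():
    print(f"{name}: {mean:.2f}s +/- {stdev:.2f}s")

# Plotting is optional so the script runs even without matplotlib.
try:
    import matplotlib
    matplotlib.use("Agg")  # headless backend, suitable for scripts
    import matplotlib.pyplot as plt

    names = list(summary)
    means = [summary[n][0] for n in names]
    errs = [summary[n][1] for n in names]
    plt.bar(names, means, yerr=errs)
    plt.ylabel("runtime (s)")
    plt.savefig("benchmark.png")
except ImportError:
    pass  # stats were already printed; plotting is a bonus
```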

I think the article's author is implicitly not considering that people who were around when Perl was popular, who were perfectly capable of "understanding" it, actively decided against it.


That's true of all renewable energy sources. So we should take advantage of all of them, as much as is feasible.


> Of course, because I am not new to the problem, whereas an LLM is new to it every new prompt.

That is true for the LLMs you have access to now. Now imagine if the LLM had been trained on your entire code base. And not just the code, but the entire commit history, commit messages and also all of your external design docs. And code and docs from all relevant projects. That LLM would not be new to the problem every prompt. Basically, imagine that you fine-tuned an LLM for your specific project. You will eventually have access to such an LLM.


AI training doesn't work like that. You don't train it on context, you train it on recognition and patterns.


You train on data. Context is also data. If you want a model to have certain data, you can bake it into the model during training, or provide it as context during inference. But if the "context" you want the model to have is big enough, you're going to want to train (or fine-tune) on it.

Consider that you're coding a Linux device driver. If you ask for help from an LLM that has never seen the Linux kernel code, has never seen a Linux device driver and has never seen all of the documentation from the Linux kernel, you're going to need to provide all of this as context. And that's both going to be onerous on you, and it might not be feasible. But if the LLM has already seen all of that during training, you don't need to provide it as context. Your context may be as simple as "I am coding a Linux device driver" and show it some of your code.
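A back-of-the-envelope calculation makes the "might not be feasible" point concrete. Every number below is a rough assumption for illustration, not a measurement:

```python
# Rough feasibility check: could the entire Linux kernel tree fit in a
# context window? All figures are order-of-magnitude assumptions.
kernel_lines = 30_000_000    # assumed size of the kernel source tree
tokens_per_line = 10         # assumed average tokens per line of C
context_window = 1_000_000   # assumed large context window, in tokens

kernel_tokens = kernel_lines * tokens_per_line
print(f"kernel: ~{kernel_tokens:,} tokens")
print(f"window: {context_window:,} tokens")
print(f"ratio: ~{kernel_tokens // context_window}x too large")
```

Under these assumptions the source alone is hundreds of times larger than the window, before counting documentation or commit history, which is the case for baking it in during training instead.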


Why haven’t the big AI companies been pursuing that approach, vs just ramping up context window size?


Well, we don't really know if they aren't doing exactly that for their internal code repos, right?

Conceptually, there is no difference between fine-tuning an LLM to be an expert on a specific country's laws and fine-tuning an LLM to be an expert on a given codebase. The former is already happening and is public. The latter is not yet public, but I believe it is happening.

The reason big companies are pursuing generic LLMs is that they serve as a foundation for basically any other derivative and domain-specific work.


Because training one family of models with very large context windows can be offered to the entire world as an online service. That is a very different business model from training or fine-tuning individual models specifically for individual customers. Someone will figure out how to do that at scale, eventually. It might require the cost of training to drop significantly. But large companies with the resources to do this for themselves will do it, and many are doing it.

