Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The source code is all the supporting code needed to run inference on the weights. This is usually python and in the case of llama it's already open source. Usually the source code is referred to as the "model". You can kind of think of the weights as a settings file in a normal desktop application. The desktop app has its own source code and loads in the settings file at runtime. It can load different settings files for different behaviors.


This is almost completely wrong. When peope who work in AI refer to the "model", they are generally referring to the weights. It is the weights which are the most important determinant of how the model performs, and it is the weights that require the most resources to develop. Associated code and other assets are also important, but they not the core asset. The intuitive sense of open sourcing a model therefore typically means releasing the weights under an open licence (ideally along with the training and inference code, data, training info, etc).


I am not making a value judgement on what's the "most important" aspect when comparing the code vs the weights. I am just explaining the terminology as I understand it. Your intuitive sense of open sourcing certainly makes sense to me. I think a lay person would expect to be able to generate content with an "open source ai model" and that wouldn't be possible if only the code was open sourced and not the weights.

If you can show me people who work in AI calling just the weights a "model" then I would happily update my internal definition of the word. I am certainly not an expert in the subject, I am just going off what I've read from the community over the past few years.


Open source is about freedom to modify the product. So in the context of an LLM, the source code is the data and the code that processes the data during *training* (not only inference), as that is what generates the weights.


I thought model is the output of training. It's a binary file black box. That's what I had read somewhere.


I think it's a little context dependent, and the definition seems to be fluid right now. I've seen "model" be used to refer to just the code, or to refer to the combination of the code and weights. I don't think I've seen it used to refer to just the weights, but I wouldn't be surprised if its used that way in some contexts.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: