Hacker News

How are LLMs increasing their context size? For the self-supervised, GPT-3-style pre-training I guess you just increase the input size, but what about RLHF? Are they creating datasets of books to feed the LLM and then having human labelers rate the responses? There might be a smart way that doesn't require new datasets.


Mosaic wrote about their new model here: https://www.mosaicml.com/blog/mpt-7b It was trained on 65k-token inputs and has decent performance working with 80k+ tokens.
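For what it's worth, the blog attributes that beyond-training-length extrapolation to ALiBi (Attention with Linear Biases), which replaces learned position embeddings with a distance-based penalty on attention scores. A minimal sketch of the idea (the function names are mine, not from the MPT codebase):

```python
def alibi_slopes(n_heads):
    # Geometric head-specific slopes as in the ALiBi paper:
    # 2^(-8/n), 2^(-16/n), ... (simple case: n_heads is a power of two).
    start = 2 ** (-8 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(slope, q_pos, k_pos):
    # Linear penalty added to a pre-softmax attention score,
    # proportional to how far the key is behind the query.
    # Because it depends only on relative distance, the same rule
    # applies at sequence lengths longer than any seen in training.
    return -slope * (q_pos - k_pos)
```

So a model trained on 65k-token windows can still produce sensible attention scores at 80k+, since no position embedding table runs out of entries.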


I don't think RLHF datasets need to take full advantage of the context window. There are also many ways to programmatically generate NLP datasets.
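As one hypothetical example of programmatic generation, you can build arbitrarily long-context examples by hiding a fact in filler text and asking the model to retrieve it, with no human labeling needed (all names below are made up for illustration):

```python
import random

def make_long_context_example(filler_sentences, n_filler=2000, seed=0):
    """Build a synthetic long-context QA pair: bury a key fact
    somewhere in a long run of distractor sentences and ask for it.
    `filler_sentences` is any pool of distractor text you supply."""
    rng = random.Random(seed)
    key = rng.randint(1000, 9999)
    fact = f"The secret code is {key}."
    body = [rng.choice(filler_sentences) for _ in range(n_filler)]
    body.insert(rng.randrange(len(body)), fact)
    prompt = " ".join(body) + "\nQuestion: What is the secret code?"
    return {"prompt": prompt, "answer": str(key)}
```

Scaling `n_filler` controls the context length, so the same generator covers any window size.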



