Hacker News
Reinforcement Learning (I.e. Policy Gradient Algorithms) (rlhfbook.com)
2 points by vinhnx 16 days ago
Reinforcement Learning from Human Feedback (rlhfbook.com)
133 points by onurkanbkrc 54 days ago | 5 comments
RLHF Book (rlhfbook.com)
479 points by jxmorris12 on Feb 1, 2025 | 37 comments
