
Fair points! Yes, PyTorch's experimental fp8 support does scale the gradients. Interesting point about a larger range for the forward pass and a smaller range for the gradients! I didn't know that, so I learnt something today! Thanks! I'll definitely read that paper!

