Hacker News
The tug-of-war between cache and capacity: from MHA, MQA, GQA to MLA (yuxi-liu-wired.github.io)
1 point by YuxiLiuWired on Feb 3, 2025

