
We will buy 4 cards if they are 48 GB or more. At a measly 16 GB, we’re just going to stick with 3090s, P40s, MI50s, etc.

> 3x VRAM speed and 3x compute

LLM scaling doesn’t work this way. If you have 4 cards, you may get a 2x performance increase if you use vLLM, but you’ll also need enough VRAM to run FP8. 3 cards would only run at 1x performance.
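A rough sketch of one likely reason the card count matters (my assumption, not something spelled out above): tensor parallelism in vLLM needs the model's attention head count to be divisible by the tensor-parallel size, so an odd card count like 3 often can't be used in full. The helper names and the 64-head, 70B-parameter model below are hypothetical, picked only for illustration.

```python
def usable_tensor_parallel_size(num_gpus: int, num_attention_heads: int) -> int:
    """Largest tensor-parallel size <= num_gpus that evenly divides the head count."""
    for tp in range(num_gpus, 0, -1):
        if num_attention_heads % tp == 0:
            return tp
    return 1


def fp8_weight_gib_per_gpu(num_params_billion: float, tp_size: int) -> float:
    """Weight memory per GPU in GiB at FP8 (1 byte per parameter).

    Weights only: KV cache, activations, and runtime overhead are NOT included,
    so real usage is higher.
    """
    total_bytes = num_params_billion * 1e9  # 1 byte per parameter at FP8
    return total_bytes / tp_size / 2**30


if __name__ == "__main__":
    heads = 64          # hypothetical model with 64 attention heads
    params_b = 70.0     # hypothetical 70B-parameter model

    for gpus in (3, 4):
        tp = usable_tensor_parallel_size(gpus, heads)
        per_gpu = fp8_weight_gib_per_gpu(params_b, tp)
        print(f"{gpus} cards -> tensor parallel size {tp}, "
              f"~{per_gpu:.1f} GiB of FP8 weights per card")
```

With these assumed numbers, 3 cards fall back to a tensor-parallel size of 2, and the FP8 weights alone at a size of 4 already exceed 16 GiB per card before any KV cache. It says nothing about the actual speedup, which depends on the interconnect and the workload.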


