
Okay, I'll be honest: I was really hyped about this model, but then I went to r/LocalLLaMA and saw that the:

120B model is worse at coding than Qwen3 Coder, GLM 4.5 Air, and even Grok 3... (https://www.reddit.com/r/LocalLLaMA/comments/1mig58x/gptoss1...)



Qwen3 Coder is 4x its size! Grok 3 is over 22x its size!

What does the resource usage look like for GLM 4.5 Air? Is that benchmark in FP16? GPT-OSS-120B will be using between 1/4 and 1/2 the VRAM that GLM-4.5 Air does, right?
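Back-of-envelope for the weights alone (a rough sketch; the ~117B-param/MXFP4 figures for GPT-OSS-120B and ~106B for GLM-4.5-Air are my assumptions from the public model cards, and this ignores KV cache and activations):

  # Weight-only memory estimate in GB; parameter counts and
  # bit widths are assumptions, not exact figures.
  def weight_gb(params_billion, bits_per_param):
      return params_billion * bits_per_param / 8

  gpt_oss  = weight_gb(117, 4.25)  # MXFP4 is ~4.25 bits/param
  glm_fp16 = weight_gb(106, 16)
  glm_int8 = weight_gb(106, 8)

  print(f"gpt-oss-120b: ~{gpt_oss:.0f} GB")
  print(f"GLM-4.5-Air FP16: ~{glm_fp16:.0f} GB, ratio {gpt_oss / glm_fp16:.2f}")
  print(f"GLM-4.5-Air INT8: ~{glm_int8:.0f} GB, ratio {gpt_oss / glm_int8:.2f}")

That works out to roughly 62 GB vs. 212 GB (0.29x) at FP16 and 106 GB (0.59x) at 8-bit, so "between 1/4 and 1/2" is about the right ballpark.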

It seems like a good showing to me, even though Qwen3 Coder and GLM 4.5 Air might be preferable for some use cases.


It's only got around 5 billion active parameters; it'd be a miracle if it was competitive at coding with SOTA models that have significantly more.


On this bench it underperforms vs glm-4.5-air, which is an MoE with fewer total params but more active params.
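Rough numbers, since per-token decode compute scales with active rather than total params (using the ~2N FLOPs/token rule of thumb; the parameter figures below are my assumptions from the public model cards):

  # Per-token FLOPs scale with *active* parameters (~2 * N_active).
  # Parameter counts are assumptions from the public model cards.
  models = {
      "gpt-oss-120b": {"total_b": 117, "active_b": 5.1},
      "glm-4.5-air":  {"total_b": 106, "active_b": 12.0},
  }
  for name, m in models.items():
      flops = 2 * m["active_b"] * 1e9
      print(f"{name}: {m['active_b']}B active / {m['total_b']}B total, "
            f"~{flops:.1e} FLOPs/token")

So GLM-4.5-Air spends roughly 2.4x the compute per generated token despite having fewer total params.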


That's SVGBench, which is a useful benchmark but isn't much of a test of general coding.


Hm, alright. I'll actually play around with this model myself instead of forming quick opinions.

Thanks.



