Qwen3 Coder is 4x its size! Grok 3 is over 22x its size!
What does the resource usage look like for GLM-4.5 Air? Is that benchmark in FP16? GPT-OSS-120B will be using between 1/4 and 1/2 the VRAM that GLM-4.5 Air does, right?
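The 1/4-to-1/2 ratio follows from a simple weights-only estimate: params × bits-per-param. A rough sketch, assuming ~117B params at ~4-bit (MXFP4) weights for GPT-OSS-120B and ~106B params for GLM-4.5 Air; the parameter counts and precisions here are my assumptions, not measured figures, and this ignores KV cache and activations:

```python
def weight_vram_gb(params_billion: float, bits_per_param: float) -> float:
    """Rough VRAM needed just for model weights (ignores KV cache, activations)."""
    return params_billion * bits_per_param / 8

# GPT-OSS-120B: assumed ~117B params, shipped with ~4-bit MoE weights.
gpt_oss = weight_vram_gb(117, 4)
# GLM-4.5 Air: assumed ~106B params, run in FP16 vs a common 8-bit quant.
glm_air_fp16 = weight_vram_gb(106, 16)
glm_air_q8 = weight_vram_gb(106, 8)

print(f"GPT-OSS-120B @4-bit : ~{gpt_oss:.0f} GB")
print(f"GLM-4.5 Air @FP16   : ~{glm_air_fp16:.0f} GB")
print(f"GLM-4.5 Air @8-bit  : ~{glm_air_q8:.0f} GB")
print(f"ratio vs FP16: {gpt_oss / glm_air_fp16:.2f}, vs 8-bit: {gpt_oss / glm_air_q8:.2f}")
```

Under those assumptions the ratio lands around 0.28 against FP16 and about 0.55 against an 8-bit quant, i.e. roughly the 1/4-to-1/2 range, depending on which precision GLM-4.5 Air is benchmarked in.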
It seems like a good showing to me, even though Qwen3 Coder and GLM-4.5 Air might be preferable for some use cases.
The 120B model is worse at coding than Qwen3 Coder, GLM-4.5 Air, and even Grok 3... (https://www.reddit.com/r/LocalLLaMA/comments/1mig58x/gptoss1...)