I'd say the 27B matches Sonnet 4.0, while the 397B A17B matches Opus 4.1. They're indeed nowhere near Sonnet 4.5, but getting a 262,144-token (256K) context window at good speed on modest hardware is huge for local inference.
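For anyone who hasn't tried local inference yet, here's a minimal sketch of what using such a setup looks like, assuming a llama.cpp-style server (or any OpenAI-compatible endpoint) already running locally; the port and model name are placeholders, not the poster's actual configuration:

    # Minimal sketch: query a locally hosted model through an
    # OpenAI-compatible endpoint (e.g. llama.cpp's llama-server).
    # base_url, port, and model name are assumptions for illustration.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="local")

    response = client.chat.completions.create(
        model="qwen3.5-27b",  # hypothetical local model identifier
        messages=[
            {"role": "user",
             "content": "Write a Python function that reverses a linked list."}
        ],
        max_tokens=512,
    )
    print(response.choices[0].message.content)

The point being that once the server is up, any tool that speaks the OpenAI API can use the local model as a drop-in backend.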
You mean 35B A3B? If this is shit, it's some of the best shit I've seen yet. Never in a million years did I think I'd have an LLM running locally, actually writing code on my behalf. Accurately, too.
none of the qwen 3.5 models are anywhere near sonnet 4.5 class, not even the largest 397b.
BUT 27b is the smartest local-sized model in the world by a wide wide margin. (35b is shit. fast shit, but shit.)
benchmarks are complete, publishing on Monday.