I've always been surprised Kimi doesn't get more attention than it does. It's always stood out to me in terms of creativity, quality... has been my favorite model for awhile (but I'm far from an authority)
Openrouter will route to china hosted models when there are US hosted providers of the same model. Is there a setting to set your preference or to blacklist providers like alibaba cloud for example?
I use OpenCode and the openrouter provider. From opencode I only select the model like kimi-2.6 and have no way of selecting which cloud hosting will receive my request.
Interesting that the best performers are all Chinese-made models (DeepSeek and Qwen also perform consistently well). I wonder if there's more focus on vision and illustration in their training, or if something else is leading to their clear lead on this one test.
I'm not really sure how this works, but I stayed on the page for a while, and then it reloaded and all clocks changed. I guess there's either a collection of different clocks generated by models, or maybe they're somehow generated in the real time, but the fact is what you see is not necessarily what I see.
It reruns a prompt every minute to all the models included. Everyone is gonna see something different but I've spent too long on it and there's a consistent pattern of Qwen and Kimi outperforming the others
This site was made months ago and it seems its only been updated with the latest model of a couple of the providers so keep in mind that many of the Chinese models haven't been updated
Seems like it regenerates them to reflect the current time. Funny to see how some models (like Kimi and Deepseek) sometimes get it right and other times fail miserably on the level of ancient models like GPT 3.5.
Kagi has it as an option in its Assistant thing, where there is naturally a lot of searching and summarizing results. I've liked its output there and in general when asked for prose that isn't in the list/Markdown-heavy "LLM style." It's hard to do a confident comparison, but it's seemed bold in arranging the output to flow well, even when that took surgery on the original doc(s). Sometimes the surgery's needed e.g. to connect related ideas the inputs treated as separate, or to ensure it really replies to the request instead of just dumping info that's somehow related to it.
The parent poster is probably referring to Kimi-Dev-72B¹, which is a much smaller and older model, while people are probably more familiar with the big and fairly powerful 1100B Kimi-K2.5².
Yes it was good for its time, but 10 months old now which is a long time ago in this space. It was also a fine-tune (albeit a good one) of Qwen-2.5 72B.
I wish they did more smaller models. Kimi Linear doesn't really count, it was more of a proof of concept thing.
I'm pretty confident that if Ai gets good enough to actually take our jobs, it's going to be good enough to take at least 80% of all knowledge work jobs, if not 100%.
In which case everybody is in the same boat and we'll all be in it together.
I don't really believe software dev is uniquely at risk to AI.
The catch-22 I run into with AI coding help is always: it helps the most with problems I know how to solve. I feel like most engineers run into problems where we can't fully articulate the problem we're having (otherwise we would be able to fix it). In which case AI can be helpful, but more in a google way.
I think that is both pretty true but massively underrated in how much faster you can solve the problems you know how to solve. I do also help it finds helps me more quickly learn how to solve new problems, but I must still must learn how to solve these new problems I have it solve those new problems or things go off the rails.
I think one of Apple's strengths since Tim Cook took over is their ability to avoid "gimmicks". As much criticism as people have of apple for not innovating on the iPhone, I appreciate their ability to not screw products up.
I'm not saying AI is a gimmick, but the caution they show is a good quality I think
Apple could have avoid that by released it half arsed like all the AI stuff, claim that it does all those things and write somewhere "AI may make mistakes".
I work in UI in enterprise, where slight color shade differences between releases can cause uproar. I cannot imagine the thought process behind liquid glass in any sense.
OSX's Aqua was also an insanely bold UI with a lot of gimmicks, but was still usable for the most part. I'm so very curious about the internal discussions around this.
Several of his “lieutenants” are following, actually.
His successor Stephen Lemay has exactly the kind of pedigree a person who cares about UI could ask for. There's a lot to be optimistic about. https://daringfireball.net/2025/12/bad_dye_job
I have no idea what's going on but Apple is an extremely top down place. Its entirely possible that Apple pivots on a dime after the departure of the baffoon.
They haven't really updated Siri though? That's still in the pipeline. So not a very fair comparison. The article states that they are behind and I think everyone knows that
AI isn't a gimmick, but a huge portion of the way it's presented to consumers is, especially given the fact that it never really was meant for consumers. As an Apple user, I'm thrilled at how "behind" they are.
But also, their tendency to "not fall from gimmicks" sometimes makes it so we didn't get a 2nd mouse button for decades. Ultimately, the way they implemented this was super cool, but still.
The balancing act of figuring out what you can reasonably rely on from an LLM and what you need to be skeptical or dismissive of is not the type of experience an iPhone user should be expected to navigate.
I was going to link you the Apple Vision Pro as a counterpoint, but after clicking the link and being reminded of what that product actually looks like, I really don't know what to say any more. I'm literally dumbfounded anyone could make your comment at all
To their credit, they specifically decided not to make a big deal out of AR like Meta did and keep production small and expensive. They realized the tech wasn't ready for a mass adoption campaign. I'd say Apple, overall, has been pretty cautious with AR. I wouldn't be surprised if they even have the guts to cancel that project entirely like they did with self-driving cars
That's not credit at all. If your strongest defense of AVP is "at least they're not Meta" then you've stopped making grounded observations and gone straight to ad-hominem.
I'd also go as far as to say that Apple knew they could have made the Vision Pro better. It should be running a real computer operating system like the headset Valve is making, and Apple knows that. The arbitrary insistence on iPad-tier software in a $3,500 headset guaranteed it was unlovable and dead-on-arrival.
I ran into an AVP recently and it actually is a great piece of hardware. It only has two issues: price and software. The former is forgivable because it really is an amazing piece of hardware and the price is justified. The latter is not and is the original sin that has killed it.
There's an unfulfilled promise of spatial computing. I wish I could load up my preferred CAD program and have wide and deep menus quickly traversable with hand gestures. Barring that the least it could do is support games. Maybe if some combination of miracle shims (fex emu, asahi, w/e) were able to get onto the platform it might be savable. The input drivers alone would be a herculean task.
I find it weird that people aren't questioning the motives behind these changes a little more. Is it not strange this is coming at the same time as the UK Online Safety Act?
There are reports of posts related to the middle eastern conflict being censored. Somehow I don't think this about violence and adult content.
Depends on if the US emperor and his cronies have the UK's backs on this issue. If they don't, calling the bluff would work, there's zero chance the UK gov would ban Apple products without US approval. The backlash among the public would be far worse than the TikTok ban. Imagine all companies using Macs. The order of power here is US > Apple > UK.
I hope they do check just for the sake of entertainment, but I can't imagine gold would be worth that much more if it was empty. I just googled and there's supposed to be about ~280 billion dollars of gold there. Which is a lot but I think there's about $12 trillion of gold in circulation
reply