Hacker Newsnew | past | comments | ask | show | jobs | submit | more syntaxing's commentslogin

What does heavy RL even mean…similar to how the CEO of cursor said how much better the perplexity got when it’s a terrible metric for model fine tune performance? Let’s be real here, it’s Kimi 2.5 fine tuned for Cursor. There’s nothing wrong with that but they tried to hide it and it’s some work they put in but nothing close to training a model of their own.


60B for Composer 2…that is built from Kimi K2… what ever happened to “Grok being the best”?


Am I the only one that thinks Composer is really good, when you factor in the speed and the cost?


I don’t doubt it is. End of the day, it’s a fine tuned Kimi. They tried to hide it and making their work sound more impressive than it is. It’s easy to have stuff be cheap when you don’t have to train your own model from scratch.


Composer is clearly dumber than the rest but then I only ask it dumb questions and it answers them really quickly.


yes, you are


With GitHub and Anthropic reducing subscription features, Chinese providers are looking more and more tempting.


Until you work for a company or government agency that is subject to any sort of technology audit. The moment offshore processes running in China comes up you'll have a never ending hole of questions to answer.


As the engineering saying goes, nothing more permanent than a temporary solution


3.4B in 4.5 months…is that all going to Anthropic? Makes it seem so with the wording and how they’re pivoting to Codex too


It's probably all AI spending, including them doing AI stuff for their products.


oh man uber is acquiring the company I work for [1] and we currently really like Claude ... but if Codex is better so be it. I just really, really, really like Claude Code as a front end. Guess I'll have to make it talk Codex instead.

[1] it's public knowledge https://investor.uber.com/news-events/news/press-release-det...


Curious how it works in other countries, do employees get a portion of the payout?


We have virtual shares that vest on sale. I won’t be rich. Enough to take a nice trip to Mallorca after taxes.


That’s the entire R&D budget - the article is completely lacking in actual details, such as how much was spent on Ai


If it is anything like my company, sign enormous deals to AI startups that have existed for 8 months, and do little more than provider wrappers around someone else's model. Then hire three different firms that do the same thing because each division has to prove how much more AI they are than the others. Have a handful of internal engineers who have no idea what they are doing, but get approval to build and run an internal B200 server farm. Ensure any big jobs are done through some kind of white-glove offering from Amazon/Azure that removes complexity, but charges astronomical rates.


"My delivery service CEO told me the AI keep eating his tokens so I asked how many tokens he has and he said he just goes to the token shop and gets a new batch of tokens afterwards so I said it sounds like he’s just feeding tokens to the AI and then his laid off workers started crying."


> And the men that had spent longer looking after babies showed the largest drops in testosterone. Those that shared a bed with their infants also had lower levels.

Dad here. Maybe…it’s the lack of sleep? Involved fathers tend to have less sleep.


Parents also tend to gain weight, and higher BMI is associated with a decline in T.

https://pmc.ncbi.nlm.nih.gov/articles/PMC3809034/


Yes being a parent would tend to correlate to a drop in physical activities and sports


Do BBC have low T?


I'm not going to google that to find out what you meant.


It’s a bbc article you gross person


Yes, chronically disturbed sleep is the obvious confounder and is well known to drop T and explains the observed small changes a lot better.


If human babies actually evolved to be the terrors they are in order to lower fathers testosterone levels/chill them out that would be wild.


Or, just gonna put this out there... you have successfully fathered a child. A drop-off in T seems normal -- you've done your job and now you care for that child and lose the drive to father a significant number more. You accomplished your biological purpose and slowly slide on into death over the next number of decades. So it is. We are not immortals and the phases of life should not be avoided out of selfish vanity. Easy to say online, eh? :)


Several of the studies described changes in hormones before the child was born.


Extra time commitment, and therefor missing some sleep, can start before the baby is born.


Reminds me of when I would stay up late ironing my wife's maternity BDUs.


With or without starch? Please tell me you were taking care of boots as well!


I don't recall using starch on the BDUs, I might have polished the boots once or twice, but that was just over twenty years ago, so who knows.


If you cosleep with your 8 month pregnant wife she might not be sleeping well and by proximity you may not be sleeping well.


> Several of the studies described changes in hormones before the child was born.

For me, sleep dropped off right after I got the "I'm pregnant" phone call. I'd only known this girl for [time it takes a baby to be detected] days.


Given this is "BBC Future" let me guess, barely above significance and n=16?


I'm unfamiliar with the subject. What's the problem with BBC Future?


On the right line. Lower sleep, higher coping (bad diet, alcohol etc) would lead to T destruction. Not surprised BBC didn't connect the dots here.


Evolutionarily this makes sense. Lower testosterone means less carousing, and better fatherhood.


I went to college as a MechE so unsure if compsci was different. But overall, all the “fun” projects were labs. We have three semesters of hell and all 3 semesters had 2-3 labs, and we write 20 pages or so for EACH lab a week (usually a team of 2-3).


Is it worth running speculative decoding on small active models like this? Or does MTP make speculative decoding unnecessary?


Any way to run this on Gemma 4 only? If there was a “local” mode, I would seriously think about installing this.


Out of curiosity, why not just try it with one of the many local managers like LM Studio or Ollama or oMLX, etc?

The Gemini app is kind of terrible (apart from the models) but Gemma 4 runs great locally already.


I run lmstudio now and it’s more like a “chat” bot. Where as Gemini app is more like an agent.


You can enable the lm studio server and use any openai compatible harness to use the models that are running inside it. OpenCode, pi, even Claude and Codex...


I’ve been missing agentic capabilities from almost all local LLM apps. It’s like they’re all stuck in 2023.

That’s why I started using OpenCode for this. It works pretty well, the web UI comes pretty close to a general chat app. You can use folders to organize your sessions like projects (which annoyingly Gemini still doesn’t have) with files and extra instructions.

It’s pretty powerful.


OpenCode is one solution, but there are also several alternatives.

For example pi-dev, but even Codex is open source and it should work with any locally-hosted model, e.g. by using the OpenAI-compatible API provided by llama-server.

I have not used pi-dev until now, but the recent presentation of pi-dev by its developer (reported on other HN threads) has convinced me that he is among the people who can distinguish good from bad, which unfortunately cannot be said about many people creating AI applications.

So I intend to switch to using pi-dev as a coding assistant for my locally-hosted models, but I do not have yet results demonstrating that this is the right choice, besides its lead developer being more trustworthy than the others.


I too am interested in Pi and Codex, but haven’t seen any full-featured web UIs for them yet. Would be happy to know if there are some!

One thing I’m considering (depending on how happy I am with OpenCode after trying to remove some questionable functionality it has) would be to make Pi (or Codex) speak the OpenCode protocol so that its web UI can be used with it.


Which is actually very impressive. 5X is a good deal, given how much more R&D and economy of scale goes into lithium batteries. Flow batteries have 2-3X more cycles and way safer


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: