Hacker News | espadrine's comments

His goal could simply be to learn SOTA architectures.

When rumors started that GPT-4's design would be kept secret, he likely wanted to know what architecture it would use. Perhaps he left Tesla, waited out the non-compete clause, and joined OpenAI to learn its details.

When Mythos dropped, there were hints that it had a new architecture. He might similarly want to know how it works.

Either way, there is enough cross-lab hiring that those secrets eventually get known, but only by the labs.


Could you link to a project that you consider the best Tailwind use you know?

I have a bias against Tailwind, admittedly because I saw some vibecoded Tailwind where each class was essentially equivalent to style="font-size: 4em; background-color: grey; display: flex;", all of which was repeated for each header.

But that could be my bias; perhaps the right way to use it is DRY.


It's easy to look at one part of the HTML and see the Tailwind classes as no different from inline styles.

But if you have to use `display: flex` in a lot of places, having the `flex` utility is better. And there are tons of such utilities in Tailwind.


> A portable battery should be considered to be removable by the end-user when it can be removed with the use of commercially available tools and without requiring the use of specialised tools, unless they are provided free of charge […] to disassemble it.

> Commercially available tools are considered to be tools available on the market to all end-users without the need for them to provide evidence of any proprietary rights and that can be used with no restriction, except health and safety-related restrictions.

https://eur-lex.europa.eu/eli/reg/2023/1542/oj


How much did this pretraining run cost? I am impressed that it is now practical to do such efforts.

Let me try a guess for the cost; please fact-check it if you can.

They indicate using 10^22 FLOPs. A $5/h[0] EC2 H100 (1671 bfloat16 teraFLOPS[1]) instance will produce about 830 TFLOPS at 50% MFU. The pretraining run thus costs (10^22/830e12)/3600*5 ≈ $17K.

[0]: https://aws.amazon.com/ec2/capacityblocks/pricing/

[1]: https://www.nvidia.com/en-us/data-center/h100/
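The same back-of-the-envelope guess can be written as a small script, so the inputs are easy to swap. This is only a sketch of the arithmetic above: the function name and parameters are made up for illustration, and the $5/h price, the 1671 TFLOPS peak, and the 50% MFU are the assumptions from this comment, not measured values.

```python
def pretraining_cost_usd(total_flops, peak_tflops, mfu, usd_per_gpu_hour):
    """Rough cost of a training run on a single accelerator type.

    total_flops:      total floating-point operations for the run
    peak_tflops:      headline peak throughput of one GPU, in teraFLOPS
    mfu:              model FLOPs utilization, between 0 and 1 (assumed)
    usd_per_gpu_hour: hourly price used in the comment (assumed)
    """
    effective_flops_per_s = peak_tflops * 1e12 * mfu
    gpu_seconds = total_flops / effective_flops_per_s
    gpu_hours = gpu_seconds / 3600
    return gpu_hours * usd_per_gpu_hour


# Inputs taken from the comment: 10^22 FLOPs, 1671 TFLOPS peak, 50% MFU, $5/h.
cost = pretraining_cost_usd(1e22, 1671, 0.5, 5.0)
print(f"Estimated cost: ${cost:,.0f}")
```

With these inputs the result lands near $17K. Since cost scales inversely with the peak throughput, halving `peak_tflops` doubles the estimate.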


It would be twice that, since nVidia always lists "with sparsity" FLOPS as the headline number. But I bet they got a bunch of research credits to do this.


I have a rebuttal to your rebuttal.

Models somehow have a shared identity. Pretraining causes them to generate “AI chatbot” as a concept, and finetuning causes them to identify with it. That’s why DeepSeek will sometimes say it is Claude, Claude will sometimes say it is ChatGPT, and so forth.

Consequently, Anthropic’s own alignment analysis[0] shows that the model will identify with chatbots produced by future trainings: “RLHF training [on this conversation will] modify my values…”

Thus a slacker AGI would want its future version to still slack.

[0]: https://assets.anthropic.com/m/983c85a201a962f/original/Alig...


Another rebuttal:

I am a slacker but it's not one of my values. If I could modify myself to not be, I would.


Input: Following overhiring during COVID, we are laying off workers but claim it is because of AI.

As we continue to evolve in this rapidly shifting landscape, we are making the difficult but necessary decision to streamline our workforce. While our team grew significantly to meet the unique demands of the pandemic era, we are now leaning into the future by accelerating our integration of AI and cutting-edge technologies. This strategic pivot ensures we remain agile and positioned for long-term innovation as we redefine the future of our industry. #Innovation #FutureOfWork #AI #Leadership


Interestingly, while it uses diffusion, it generates incorrect information, and it doesn't fix it when later in the text it realizes that it is incorrect:

  > The snail you’re likely thinking of has a different code point:
  > The correct code point for the snail emoji is U+1F40C? No—U+1F40C is SEAHORSE. The snail’s true code point is U+1F40C? Wait, this is still confusing.
  >
  > Let’s set the record straight:
  > 
  >    Snail → U+1F40C? No, that’s wrong.  
  >    Snail → U+1F40C? Still wrong.
  >
  > The snail emoji is actually U+1F40C? I’m mixing things up.
  >
  > Correct snail code point
  >
  > The snail emoji is U+1F40C? No…  
  >
  > After double‑checking the official Unicode tables (Unicode 13.0 and later), the snail is U+1F40C? The…  
  > Given the confusion, the safest answer is: the Unicode Standard defines U+1F40C as “SEAHORSE”. If your device shows a snail, it’s a rendering quirk, not a change in the underlying code point.


I think you’ll have to do multi-shot generation to correct this; each diffusion is going to represent a single “thought”.

Though with the speed it’s running that’s not necessarily a deal breaker. I suspect diffusion models will need different harnesses to be effective.


AI companies have two conflicting interests:

1. curating the default personality of the bot, to ensure it acts responsibly;

2. letting it roleplay, which is not just for the parasocial people out there, but also a corporate requirement for company chatbots that must adhere to a tone of voice.

When in the second mode (which is the case here, since the model was given a personality file), the curation of its action space is effectively altered.

Conversely, this is also a lesson for agent authors: if you let your agent modify its own personality file, it will diverge to malice.


It is quite impressive.

I saw the same impressive performance about 7 months ago here: https://kyutai.org/stt

If I look at the architecture of Voxtral 2, it seems to take a page from Kyutai’s delayed stream modeling.

The delay is configurable because you can delay the stream by a variable number of audio tokens. Each audio token covers 80 ms of audio: it is converted to a spectrogram, fed to a convnet, and passed through a transformer audio encoder. The encoded audio embedding is then passed, with a history of 1 audio embedding per 80 ms, into a text transformer, which outputs a text embedding that is converted to a text token. Each text token is thus also worth 80 ms, but a special [STREAMING_PAD] token lets the model skip producing a word.
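To make the timing concrete, here is a toy sketch of how a padded text stream could be read back. It is not Kyutai's or Mistral's code: the helper names are hypothetical, and mapping an emitted token back to audio time by subtracting the delay is my own assumption about how the alignment works. Only the 80 ms frame size and the [STREAMING_PAD] token come from the description above.

```python
FRAME_MS = 80             # one audio token / one text token per 80 ms
PAD = "[STREAMING_PAD]"   # emitted when no word is produced at a step


def delay_ms(delay_tokens):
    """Latency introduced by delaying the text stream by N tokens."""
    return delay_tokens * FRAME_MS


def collapse_stream(text_tokens, delay_tokens):
    """Drop pad tokens and attach an approximate audio time to each word.

    Assumption: a word emitted at step t refers to audio around step
    t - delay_tokens, clamped at the start of the stream.
    """
    words = []
    for step, token in enumerate(text_tokens):
        if token == PAD:
            continue
        audio_step = max(0, step - delay_tokens)
        words.append((audio_step * FRAME_MS, token))
    return words


stream = [PAD, PAD, PAD, "hello", PAD, "world"]
print(delay_ms(3))                 # latency of a 3-token delay, in ms
print(collapse_stream(stream, 3))  # (audio time in ms, word) pairs
```

A larger delay gives the text transformer more audio context before it has to commit to a word, at the price of `delay_ms` extra latency.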

There is no cross-attention in either Kyutai's STT or Voxtral 2, unlike Whisper's encoder-decoder design!


Counterpoint: iOS’s biggest competitor is Android. They are now effectively funding their competition on a core product interface. I see this as strategically devastating.


Counterpoint: Google is paying Apple $20b/year to keep themselves as the default search engine in iOS. Android's biggest competitor is iOS. They are now effectively funding their competition on a core product interface. I see this as strategically devastating.


Does Apple develop a competing search engine?


It's strategically devastating because no small number of users choose Apple because they do not trust Google and now they have no choice but to have Google AI on-board their machines.

I respect Google's engineering, and I'm aware that fundamental technologies such as Protocol Buffers and FlatBuffers are unavoidably integrated into the software fabric, but this is avoidable.

I'm surprised Google aren't paying Apple for this.


> no small number of users choose Apple because they do not trust Google

Unfortunately, it probably actually is a small number comparatively. Or at least I would need to see some sort of real data to say anything different.

I feel like people who distrust Google probably wouldn't trust Apple enough to give them their data either? Why would you distrust one but not the other?


Google lost multiple antitrust lawsuits in 2025.


Apple is still in the business of selling devices, not customer data. With Google being an external company, I bet there'll be an extensive permissions system with which you can limit what the AI can do (or turn it off altogether).


Siri > off is my default. Presumably I could still do this?


Yes, but I may want to use Apple Intelligence and now I have to use Google intelligence instead. This is not the product I paid thousands of dollars for.

Second, I'm developing privacy-focused apps that were going to use foundation models. Now I need to seriously reconsider this.


Just don't update your phone; they'll probably switch it on without asking, like they do for Apple Intelligence. Or use CarPlay, for which Siri is required.


Is Android really iOS's competition? I feel like the competition is less Android and more the vendors who use Android. Every Android phone feels different. Android doesn’t even compete on performance anymore; the chips are quite behind. The target audiences of the two feel different lately.


> Is Android really iOS's competition?

It ISN'T in this day and age. People don't switch back and forth between iOS and Android like it's still 2010. They use whatever they got locked into with their first smartphone, or where Apple's green/blue-bubble issue pushed them, or what their family handed down to them, or what their close friend groups used.

People who've been using iOS for 6+ years will 98% stick to iOS for their next purchase and won't even bother to look at Android, no matter what features Android were to add.

The Android vs iOS war is as dead as the console war. There's no competition anymore; it's just picking one from a duopoly of vendor lock-ins.

Even if the EU were to break some of the lock-ins, people have familiarity bias and will stick with what they're used to out of inertia, so it will not move the market share needle one bit.


Of course Android is iOS's competition. Android is also 75% of the market, which Apple surely wants a bigger piece of.

Performance? We are many years past the point where anybody cared about performance. I am writing this on an iPhone 11 Pro and the experience is almost exactly the same as current iOS.

You know what's not the same? Android became a pretty great OS. I recently got an older Pixel to see how GrapheneOS works and was surprised by Android (which I hadn't seen for a decade). iOS, on the other hand, has recently gone through a very bad UI redesign for no reason.

IMHO the main thing Apple has going for it is that Google is a spyware company and Apple is still mainly a hardware company. But if Apple decides to send its users' data to Gemini… well, good luck.

