The big question is whether Apple can keep shipping new models constantly.
AFAIK the current model is on par with Qwen-3-4B, which is from a year ago [0]. There's a big leap going from last year's Qwen-3-4B to Qwen-3.5-4B or to Gemma 4.
Apple's model is nice since you don't need to download anything else, but I'd rather use the latest model than a model from a year ago.
I'm curious about the multimodal capabilities of the E2B and E4B, and how fast they are.
In ChatGPT right now, you can have an audio and video feed going to the AI, and the AI can respond in real time.
Now I wonder if the E2B or the E4B is capable enough for this and fast enough to be run on an iPhone. Basically replicating that experience, but all the computations (STT, LLM, and TTS) are done locally on the phone.
I just made this [0] last week so I know you can run a real-time voice conversation with an AI on an iPhone, but it'd be a totally different experience if it can also process a live camera feed.
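For what it's worth, the fully local pipeline described above is conceptually just three stages chained per conversational turn. Here's a minimal sketch with the actual engines stubbed out; the function names and signatures are illustrative, not any real STT/LLM/TTS API, and a real on-device version would plug in local models (e.g. a Whisper-class STT, an E2B/E4B-class LLM, and a local TTS) plus streaming.

```python
# Minimal sketch of a fully on-device voice loop (STT -> LLM -> TTS).
# The three engines are hypothetical stand-ins, not real model APIs.

from typing import Callable

def voice_turn(
    audio_in: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """One conversational turn: audio in, audio out, all local."""
    text = stt(audio_in)   # speech -> text
    reply = llm(text)      # text -> model response
    return tts(reply)      # response -> synthesized audio

# Stub engines so the sketch runs end to end.
stub_stt = lambda audio: audio.decode("utf-8")
stub_llm = lambda text: f"echo: {text}"
stub_tts = lambda text: text.encode("utf-8")

out = voice_turn(b"hello", stub_stt, stub_llm, stub_tts)
print(out)  # b'echo: hello'
```

Adding a live camera feed would mean interleaving image frames into the LLM stage, which is where the multimodal E2B/E4B question really bites.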
I just want to say thanks. Finding out about these kind of projects that people are working on is what I come to HN for, and what excites me about software engineering!
The E2B and E4B models support 128k context, not 256k, and even with the 128k... it could take a long time to process that much context on most phones, even with the processor running full tilt. It's hard to say without benchmarks, but 128k supported isn't the same as 128k practical. It will be interesting to see.
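A quick back-of-the-envelope shows why "128k supported" and "128k practical" can be so far apart. The tokens-per-second figures below are made-up illustrative numbers, not benchmarks of any real phone:

```python
# Back-of-envelope prompt-processing (prefill) time for a long context.
# Throughput numbers are assumed/illustrative, not measured.

def prefill_seconds(context_tokens: int, prefill_tps: float) -> float:
    """Time to process the prompt before the first output token."""
    return context_tokens / prefill_tps

context = 128_000  # tokens
for tps in (100, 500, 2000):  # assumed prefill throughput (tok/s)
    mins = prefill_seconds(context, tps) / 60
    print(f"{tps:>5} tok/s -> {mins:6.1f} min just to read the prompt")
```

Even at an (optimistic for a phone) 2000 tok/s prefill, a full 128k prompt takes over a minute before the first token comes out, and that's ignoring memory pressure and thermals.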
Forget developing countries, an iPhone is a luxury even in some European countries, where rent is 500+ euros and your take-home pay is ~1000. After all the other bills you're not left with iPhone money, which is why the 100-200 euro models from Chinese brands are doing so well.
It's easier to name the countries where iPhone ISN'T a luxury, as you can count them on very few hands.
Many countries would develop much faster if they weren't bombed or kept under puppet dictators by (economically) developed nations (the USA and France keep doing this intensively, while countries like Germany don't mind supporting fascist states). (PS: I'm not woke, not even Marxist.)
Wow it's true. Anthropic actually had me fooled. I saw the GitHub repository and just assumed it was open source. Didn't look at the actual files too closely. There's pretty much nothing there.
So glad I took the time to firejail this thing before running it.
Not really, except that they have a bunch of weird things in the source code and people like to make fun of it. OpenCode/Codex generally don't have this, since they have been open-source projects from the get-go.
That’s awesome! I’ve got a similar project for macOS/iOS using the Apple Intelligence models and the on-device STT Transcriber APIs. Do you think the models you’re using could be quantized further so that they could be downloaded on first run using Background Assets? Maybe we’re not there yet, but I’m interested in a better, local Siri like this with some sort of “agentic lite” capabilities.
> Do you think the models you’re using could be quantized further so that they could be downloaded on first run using Background Assets?
I first tried Qwen 3.5 0.8B at Q4_K_S and the model couldn't hold a basic conversation, although I haven't tried lower quants on the 2B.
I'm also interested in the Apple Foundation models, and it's something I plan to try next. AFAIK it's on par with Qwen-3-4B [0]. The biggest upside, as you alluded to, is that you don't need to download it, which is huge for user onboarding.
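On the Background Assets question, a rough size estimate is just parameters times bits per weight. The bits-per-weight figures below are approximate (Q4_K_S lands around 4.5 bpw in practice, and all of this ignores format overhead), so treat this as a sketch, not a spec:

```python
# Rough quantized-model file size: size ≈ params * bits_per_weight / 8.
# Bits-per-weight values here are approximate/illustrative.

def approx_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB for a quantized model."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# A ~2B-parameter model at a few common (approximate) quant levels:
for name, bpw in [("FP16", 16.0), ("Q8_0", 8.5), ("Q4_K_S", 4.5)]:
    print(f"{name:>7}: ~{approx_size_gb(2.0, bpw):.2f} GB")
```

So a 2B model at ~4.5 bpw is in the ~1.1 GB range, which seems plausible for a first-run Background Assets download; whether quality survives at lower quants is the open question.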
Subjectively, AFM isn’t even close to Qwen. It’s one of the weakest models I’ve used. I’m not even sure how many people have Apple Intelligence enabled. But I agree, there must be a huge onboarding win long-term using (and adapting) a model that’s already optimized for your machine. I’ve learned how to navigate most of its shortcomings, but it’s not the most pleasant to work with.
Totally agree. There are significantly more new apps being released. I've been visiting the /r/macapps subreddit and the mods there are having trouble filtering new submissions. I generally like the direction they're taking: https://www.reddit.com/r/macapps/comments/1ryaeex/rmacapps_m...
Even though it's more troublesome to submit apps to the App Store, it's one signal that the app is not malware.
Wow, this subreddit looks like the apocalypse of vibe coded projects/apps. Kind of similar to what happened to "show HN". Too many ideas, not enough problems to solve, and likely bad implementations. The result is that nobody uses any of the apps.
In AI conversations, people often forget that at the end of the day, an actual human needs to use your stuff.
Does this include any provider that does not fall under the USA CLOUD Act? This vulnerability disclosure timeline is a nightmare for us Europeans: it was fully disclosed yesterday in the late afternoon for us, and I can trace attack logs back to overnight. I expect some fallout from this.
I genuinely believe Next.js is a great framework, but as a European developer working on software that must not touch anything related to the CLOUD Act, you're just telling me that Next.js and React, despite being OSS, are not made for me anymore.
It’s infuriating how US-centric some OSS maintainers can be. Really sad if the OSS ecosystem also has to fragment into pieces the way much of the internet is starting to.
https://machinelearning.apple.com/research/apple-foundation-...