I'm too concerned about data exfiltration to use many AI services unless their terms of service state they will not use your data for training or anything else. Zero retention is what I'm looking for. I care because I frequently work on proprietary code that I do not personally own (as most employed software devs do). So if I'm using an AI service with proprietary code, I want assurances that there is no retention and no training happening. From my American perspective, Chinese companies don't have the best track record of not training on proprietary information. I guess LLMs in general are trained on a lot of proprietary information; I just don't want to be responsible for unintentionally exfiltrating my employer's proprietary code.
My recent frustration with Claude is that it feels like I'm waiting on responses more. I don't have historical latency numbers to compare against, but it feels like it has been getting slower. I may be wrong, and maybe it's just spending more time thinking than it used to. My guess is Anthropic is having capacity issues. I hope I'm wrong, because I don't want to switch.
There was a really good point in this podcast episode about the speed of LLMs. They are so slow that all of the progress messages and token streaming are necessary. But the core problem is that the technology is so darn slow.
As someone who both uses and builds this technology I think this is a core UX issue we’re going to be improving for a while. At times it really feels like a choose 2+ of: slow, bad, and expensive.
I feel like there have been enough hyperbolic claims by Anthropic that I'm starting to get some real Boy Who Cried Wolf energy. I'm starting to tune out and assume it's a marketing ploy. Trust me, I'm an Anthropic fan, and I pay my $200/month for Max, but the claims are wearing thin.
Thank you. I have the same card, and I noticed the same ~100 TPS when I ran Q3.5-35B-A3B. G4 26B A4B running at 150 TPS is a 50% performance gain. That's pretty huge.
I feel the global instability could easily be very disruptive to SpaceX. Just imagine if Russia gets vindictive and starts destroying these satellites, or blows up its own satellites to create orbital debris that knocks others out of orbit. A really bad solar storm could be devastating too.
Just saying there are some decent risks, and pricing the IPO at $1.75T seems risky enough on its own. I would not take that gamble.
> imagine if Russia gets vindictive and starts destroying these satellites
Sounds like lots of demand for new launches from the military-industrial complex.
> imagine if Russia gets vindictive and starts destroying these satellites
Space is big. It’s almost always cheaper to individually target satellites than to try and blanket orbits. And with Starship vs ASAT, the cheap drones are the satellites. Russia would bankrupt itself trying to sink Starlink and Starshield.
(They would also set a precedent that would let the U.S. deny China a LEO constellation.)
> It’s almost always cheaper to individually target satellites than to try and blanket orbits.
The problem is that even one destroyed satellite could trigger Kessler syndrome, given how many are currently in orbit, and the numbers are expected to keep increasing rapidly - everyone wants their own "sovereign" Starlink now that it has been shown to be feasible and performant.
I think two recent advances make your statement more true. The new Qwen 3.5 series has shown a relatively high intelligence density, and Google's new turboquant could result in dramatically smaller/efficient models without the normal quantization accuracy tradeoff.
I would expect consumer inference ASIC chips will emerge when model developments start plateauing, and "baking" a highly capable and dense model to a chip makes economic sense.
Who will be funding state of the art local models going forward? AI models are never done or good enough. They will have to be trained on new data and eventually with new model architectures. It will remain an expensive exercise.
I could be wrong because I'm not following this too closely, but the open weights future of both Llama and Qwen looks tenuous to me. Yes, there are others, but I don't understand the business model.
If turboquant can reliably reduce LLM inference RAM requirements by 6x, that should dramatically shift the hardware market, or at least we can all hope. I know the 6x figure is the key-value cache saving, though, so I'm not sure it really translates to a 6x decrease in total RAM requirements for inference.
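A back-of-the-envelope sketch of why a KV-cache-only saving doesn't translate to the same total saving (all numbers here are hypothetical, just for illustration):

```python
# Hypothetical inference memory budget: weights usually dominate,
# so shrinking only the KV cache shrinks the total far less than 6x.

weights_gb = 16.0   # assumed: e.g. an 8B-param model at ~2 bytes/param
kv_cache_gb = 4.0   # assumed: a long-context KV cache

total_before = weights_gb + kv_cache_gb       # 20.0 GB
total_after = weights_gb + kv_cache_gb / 6.0  # ~16.7 GB

print(f"before: {total_before:.1f} GB, after: {total_after:.1f} GB")
print(f"total savings: {total_before / total_after:.2f}x")
```

With these made-up numbers the total saving is only about 1.2x, not 6x; the cache saving mostly buys longer context windows rather than a smaller overall footprint.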
There are many sources of before-and-after data on school cell phone bans. Oregon is far from the first to implement this: 35 US states have some form of school cell phone ban, and I believe the UK is doing a nationwide ban. There is a good amount of supporting data measuring results on this topic.
For ~$60 you get a device that can play every type of audio file and has better sound quality than your cellphone + streamer combo.
I've been reading more about Chinese hardware, and if you've been sleeping on it: there are a lot of great Chinese consumer products that are both extremely high quality and very cheap.
Turns out when you have tens of millions of engineers, they pump out banger after banger. It's also enduringly hilarious to find the factory engineers engaging with consumers on random forums and taking their feedback seriously.
Note that in this case, you get what you pay for: I had a FIIO DAC that sounded amazing but produced full-scale pops on turn-on and on sync/desync, to the extent that it damaged my speakers. Yes, perfect power-sequence hygiene would have prevented the problem, but one can't always be ready with the amplifier volume knob when their playback system crashes.
Ah, good to know. Outside of having a very basic DAC for my cans on my desktop, I wouldn't have thought any serious equipment failures could happen. Probably wrong to assume these things are engineered to be safe/redundant.
This is going to be my first DAP in like 15 years, a Zune being the last one I had. Pretty excited to rock it out for a bit.
There's a current fad of moving to more single-purpose devices rather than using a phone for everything. I want to try it out myself to be more intentional with my digital actions and wean myself away from corporate social media.
If they're allowed, and help where phones wouldn't or don't, there are still lots of options for standalone MP3 players with minimal or no connectivity. They still exist as a market because they're dirt cheap to make.
This is what I've been working on. I've written a project wrapper CLI that presents a consistent interface over a bunch of tools, plus a skill that states when and how to call it. AI agents are frequently inconsistent in how they call things, and there are some things I want executed in a consistent, controlled way.
It is also easier to write and debug CLI tooling, and other human devs get to benefit from the CLI tools. MCP includes agent instructions for how to use a tool, but the same can be done for a CLI with skills or AGENTS.md (CLAUDE.md).
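The wrapper idea above can be sketched roughly like this (a minimal illustration; the `proj` name, subcommands, and underlying tool invocations are all hypothetical, not the commenter's actual tooling):

```python
#!/usr/bin/env python3
"""Minimal sketch of a project wrapper CLI: one consistent interface,
where the wrapper (not the caller) fixes the exact underlying flags."""
import argparse
import subprocess
import sys

# Each subcommand maps to one exact invocation, so an agent (or human)
# can't vary the flags from run to run.
COMMANDS = {
    "test": ["pytest", "-q"],
    "lint": ["ruff", "check", "."],
    "fmt":  ["ruff", "format", "."],
}

def main(argv=None):
    parser = argparse.ArgumentParser(prog="proj")
    parser.add_argument("command", choices=sorted(COMMANDS))
    args = parser.parse_args(argv)
    return subprocess.call(COMMANDS[args.command])

if __name__ == "__main__":
    sys.exit(main())
```

A skill or AGENTS.md entry can then just say "run `proj test` before committing", and every invocation lands on the same controlled command line.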