mattnewton · 2026-05-04T23:14:21 1777936461

They are betting they can sell the bag before the music stops.

mattnewton · 2026-04-30T16:31:22 1777566682

> For instance, maybe you can't afford to take on more customers right now, Anthropic. Maybe if you are severely undermining the customer relationships you already have, you should just admit you can't sell any more 20x plans right now and only accept new customers at lower tiers until you have the necessary capacity.

Or just increase prices for new Claude Code users? Surely transparent, upfront, across-the-board price increases are easier to swallow than hidden context-based pricing changes like this?

mattnewton · 2026-04-28T16:57:16 1777395436

Idk about morality, but it’s certainly a way to stop dystopian mass surveillance nightmares if everyone capable of building one refuses.

So if you live in the US and don’t want one government agency in the US to have this power (that is ambiguous under current law), one way you can try to avoid it is by refusing to sell it to them and urging others to do the same.

It’s a long shot, sure, but it certainly seems more effective than hoping the legislature wakes up and reins in the executive these days.

mattnewton · 2026-04-28T15:47:40 1777391260

My best guess is that for certain goods, people make purchasing decisions around major life events like getting married or becoming parents. If they have already crossed those thresholds, those purchase patterns may be harder to unseat and replace: they may already be meeting those new needs and have habits around existing brands.

My second guess is that DINKs have more disposable income.

quickthrowman · 2026-04-28T17:52:06 1777398726

Disposable income is gross income minus taxes. A 2 parent family with 2 kids making the same as a DINK couple would have a lower tax burden and more disposable income, but the DINK couple almost certainly has more discretionary income, which is the money left after paying for necessities like housing, food, and medical care.
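To make the distinction concrete, here is a minimal sketch of the two definitions. Every dollar figure is a made-up assumption for illustration, not real tax or cost-of-living data; the only point is that the ordering can flip between the two measures.

```python
def disposable_income(gross, taxes):
    """Disposable income: gross income minus taxes."""
    return gross - taxes


def discretionary_income(gross, taxes, necessities):
    """Discretionary income: disposable income minus necessities
    (housing, food, medical care, and so on)."""
    return disposable_income(gross, taxes) - necessities


# Hypothetical figures: both households earn the same gross income.
GROSS = 150_000
dink = {"taxes": 30_000, "necessities": 50_000}
family = {"taxes": 24_000, "necessities": 85_000}  # lower taxes, higher necessities

dink_disposable = disposable_income(GROSS, dink["taxes"])
family_disposable = disposable_income(GROSS, family["taxes"])
dink_discretionary = discretionary_income(GROSS, dink["taxes"], dink["necessities"])
family_discretionary = discretionary_income(GROSS, family["taxes"], family["necessities"])

print("disposable:   ", dink_disposable, family_disposable)
print("discretionary:", dink_discretionary, family_discretionary)
```

With these assumed numbers the family comes out ahead on disposable income and the DINK couple comes out ahead on discretionary income.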

BobaFloutist · 2026-04-28T18:13:54 1777400034

On the other hand, a 2 parent family with 2 kids is quite likely to not be making the same as a DINK couple.

mattnewton · 2026-04-27T20:52:57 1777323177

People are trying, especially for inference. For training, I think the risk of tanking your training run is just too high.

TPUs are at least dogfooded by Google DeepMind; no team, AFAIK, has gotten the AMD stack to train well.

coder-3 · 2026-04-27T21:41:11 1777326071

Interesting. Why? My current mental model is that AMD chips are just a bit behind, so, less efficient, but no biggie. Do labs even use CUDA?

nl · 2026-04-28T00:15:11 1777335311

This is somewhat out of date (Dec 2024), but gives you some idea of how far behind AMD was then: https://newsletter.semianalysis.com/p/mi300x-vs-h100-vs-h200...

Pull quotes:

> AMD’s software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD’s weaker-than-expected software Quality Assurance (QA) culture and its challenging out of the box experience.

[snip]

> The only reason we have been able to get AMD performance within 75% of H100/H200 performance is because we have been supported by multiple teams at AMD in fixing numerous AMD software bugs. To get AMD to a usable state with somewhat reasonable performance, a giant ~60 command Dockerfile that builds dependencies from source, hand crafted by an AMD principal engineer, was specifically provided for us

[snip]

> AMD hipBLASLt/rocBLAS’s heuristic model picks the wrong algorithm for most shapes out of the box, which is why so much time-consuming tuning is required by the end user.

etc etc. The whole thing is worth reading.

I'm sure it has improved since then (and will continue to). I hear good things about the Lemonade team (although I think that is mostly inference?)

But the NVidia stack has improved too.

_vertigo · 2026-04-28T03:11:40 1777345900

That’s insane. There should be a big team of people at AMD whose whole job is just to dogfood their stuff for training like this. Speaking of which, Amazon is in the same boat, I’m constantly surprised that Amazon is not treating improving Inferentia/Trainium software as an uber-priority. (I work at Amazon)

chii · 2026-04-28T05:28:47 1777354127

> There should be a big team of people at AMD whose whole job is just to dogfood their stuff

if they had this management attitude, they wouldn't have been so far behind so as to need this action in the first place!

nl · 2026-04-28T05:51:37 1777355497

I'll just leave this here from 10 years ago:

> “Are we afraid of our competitors? No, we’re completely unafraid of our competitors,” said Taylor. “For the most part, because—in the case of Nvidia—they don’t appear to care that much about VR. And in the case of the dollars spent on R&D, they seem to be very happy doing stuff in the car industry, and long may that continue—good luck to them.

https://arstechnica.com/gadgets/2016/04/amd-focusing-on-vr-m...

"car industry" is linked to the GPU-accelerated self-driving car work, ie, making neural networks run fast on GPUs: https://arstechnica.com/gadgets/2016/01/nvidia-outs-pascal-g...

coredog64 · 2026-04-28T14:07:13 1777385233

Where's the scope for an L7 promo in "Fixed a bunch of tiny issues that were making it hard to use Trainium/Inferentia with PyTorch"?

Amazon's compensation strategy, in which you primarily get a raise years in the future for tricking your management chain into promoting you, is definitely bearing its rotten fruit.

wongarsu · 2026-04-28T11:33:05 1777375985

Hardware companies being terrible at software is the norm. Nvidia is one of the rare companies that can successfully execute both.

Maybe Amazon is an example of how this happens even to hardware divisions within software/logistics companies.

Geezus_42 · 2026-04-28T14:05:15 1777385115

How are their Linux drivers looking these days? Still a PITA to install?

whywhywhywhy · 2026-04-28T10:23:19 1777371799

I mean, the fact that there isn’t one even today may speak to why AMD isn’t the contender it should be by this point.

moritonal · 2026-04-28T08:32:53 1777365173

Anecdotal but over several years with an AMD GPU in my desktop I've tried multiple times to do real AI work and given up every time with the AMD stack.

calgoo · 2026-04-28T09:01:36 1777366896

I'm running fine on my AMD 7800 XT 16GB... Yes, memory is a bit limited, but apart from that I have found that it works great using Vulkan in LM Studio, for example.

ROCm works great too; the only issue I have had is that my machine froze a couple of times as it used 100% of the graphics and the OS had nothing left. Since moving to Vulkan I stopped getting these errors, apart from a little UI slowdown when I had 4 models loaded at the same time taking turns.

I'm also on an i7 6700 with 32GB DDR4, so I'm sure that is causing more slowdowns than the graphics card.

djhn · 2026-04-28T04:29:47 1777350587

Yet another reason to doubt claims that ”software is solved”.

Anthropic did retire an interview take-home assignment involving optimising inference on exotic hardware, because Claude could one shot a solution, but that was clearly a whiteboard hypothetical instead of a real system with warts, issues and nuance.

electroglyph · 2026-04-28T05:31:48 1777354308

i'm doing inference on a free mi300x instance from AMD right now. not sure if the software stack is just old or what, but here's what i've observed: stuck on an old version of vllm pre-Transformers 5 support. it lacks MoE support for qwen3 models. oss-120b is faaaar slower than it should be.

int8 quantization seems like it's almost supported, but not quite. speeds drop to a fraction of full precision speed and the server seems like it intermittently hangs. int4 quantization not supported. fp8 quantization not supported.

again, maybe AMD is just being lazy with what they've provided, but it's not a great look.

right now the fastest smart model i can run is full precision qwen3-32b. with 120 parallel requests (short context) i'm getting PP @ 4500 tokens/sec and TG @ 1300 tokens/sec
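For a rough sense of what those numbers mean per request, here is a back-of-the-envelope sketch. It assumes the PP and TG figures are aggregates across all 120 parallel requests and that the load is shared evenly; neither assumption is confirmed by the measurement above.

```python
def per_request_rate(aggregate_tokens_per_sec, parallel_requests):
    """Average tokens/sec seen by one request, assuming even sharing."""
    return aggregate_tokens_per_sec / parallel_requests


PARALLEL_REQUESTS = 120
PP_AGGREGATE = 4500  # prompt processing, tokens/sec (figure quoted above)
TG_AGGREGATE = 1300  # token generation, tokens/sec (figure quoted above)

pp_each = per_request_rate(PP_AGGREGATE, PARALLEL_REQUESTS)
tg_each = per_request_rate(TG_AGGREGATE, PARALLEL_REQUESTS)

print(f"PP per request: {pp_each:.1f} tokens/sec")
print(f"TG per request: {tg_each:.1f} tokens/sec")
```

Under those assumptions each request generates at roughly 11 tokens/sec.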

uberduper · 2026-04-27T23:47:04 1777333624

amd gpus compete but they lack the interconnect. NVLink performance is a huge deal for training.

bean469 · 2026-04-28T06:27:29 1777357649

> Do labs even use CUDA?

From the papers I've read and the labs that I have worked in personally, I would say that most scientists developing deep learning solutions use CUDA for GPU acceleration.

0-_-0 · 2026-04-27T21:45:38 1777326338

What I hear is that getting your network to work on AMD is a huge pain.

dnadler · 2026-04-27T23:22:21 1777332141

Yeah, historically it’s been software that’s limited AMD here. Not surprised to hear that may still be the issue. NVidia’s biggest edge was really CUDA.

otabdeveloper4 · 2026-04-28T12:18:41 1777378721

CUDA is a complete and utter piece of shit software. It's just that it is a tiny bit less of a shitshow than the alternatives.

f6v · 2026-04-28T09:39:10 1777369150

I don’t know which is the chicken and which is the egg here. But ROCm support is often missing or experimental even in very basic foundational libraries. They need someone else to double down on using their chips and just break the software support out of limbo.

mlmonkey · 2026-04-28T14:20:28 1777386028

This is what I've heard on the "street". Building a CUDA-compatible stack for AMD's hardware requires highly-paid SWEs. It's a very niche field, and talent is hard to come by.

But AMD does not want to pay these specialized SWEs the market rate. Their existing SWEs would be up in arms saying, basically, "what are we, chopped liver??", or so the thinking goes.

So AMD is stuck with a shitty software stack which cannot compete with CUDA.

If I were making such decisions, I would just cull the number of existing SWEs down by 50%, and double the pay for remaining ones. And then go out and hire some top talent to build a good software stack.

square_usual · 2026-04-28T14:21:04 1777386064

> highly-laid SWEs

Freudian slip?

mlmonkey · 2026-04-28T15:36:56 1777390616

Ha! You caught it before I did; and I caught it right away.

mattnewton · 2026-04-10T17:27:30 1775842050

This exactly. I don’t believe the government should be censoring porn, but I have a really hard time arguing that principle against studies that suggest it is normalizing choking and slapping women among the young men exposed to it. Why is this roleplay fetish the beachhead and not something like that?

fiddeert · 2026-04-11T03:33:02 1775878382

That's an odd perspective. I have heard of young women demanding to be choked and slapped (with their partners acquiescing) far far more than young men instigating the behaviour.

mattnewton · 2026-04-11T20:18:18 1775938698

https://www.theguardian.com/lifeandstyle/2025/jul/07/no-safe...

> Now thought to be the second most common cause of stroke in women under 40, it can also lead to difficulty swallowing, incontinence, seizures, memory problems, depression, anxiety and miscarriage.

Looks like they are going after that too though.

mattnewton · 2026-04-10T15:33:44 1775835224

I’d hope that the average American doesn’t care about jobs they aren’t qualified for being filled by people paying taxes into their communities.

mattnewton · 2026-04-04T19:40:25 1775331625

Still undesirable latency for a lot of compute use cases, like image or video editing; it’s really only negligible for LLMs.

Since that’s definitely a big enough use case all on its own, I wonder if such a product should really just double down on LLMs.

serf · 2026-04-04T21:14:05 1775337245

remote GPU compute payloads have been around a lot longer than LLMs, they're just few and far between.

folding@home and other such asynchronous "get this packet of work done and get back to me" style operations rarely care much about latency.

Remote transcoding efforts can usually adjust whatever buffer is needed to cover huge latency gaps, and a lot of sim and render suites can do remote work regardless of machine to machine latency.

I just sort of figure the industry will trend more async when latency becomes a bigger issue than compute. Won't work in some places, but I think we tend to avoid thinking that way right now due to a lack of real need to do so; but latency is one of those numbers that trends down slowly.
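A minimal sketch of why async batch work shrugs off latency while interactive work does not: compare the share of wall-clock time spent on the network round trip. The round-trip time and job durations below are hypothetical values picked for illustration, not measurements.

```python
def latency_share(round_trip_s, compute_s):
    """Fraction of total time spent waiting on the network round trip."""
    return round_trip_s / (round_trip_s + compute_s)


ROUND_TRIP_S = 0.05  # hypothetical 50 ms round trip to a remote GPU

# Hypothetical interactive edit: 10 ms of GPU work per operation.
interactive = latency_share(ROUND_TRIP_S, 0.01)

# Hypothetical batch work unit: 10 minutes of GPU work per packet.
batch = latency_share(ROUND_TRIP_S, 600.0)

print(f"interactive: {interactive:.1%} of time is latency")
print(f"batch:       {batch:.4%} of time is latency")
```

With these assumed numbers latency dominates the interactive case and is a rounding error for the batch case.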

mattnewton · 2026-03-17T19:03:23 1773774203

Like gravity, there is some inexorable force drawing the state towards mass surveillance tools, as they make the job easier. Removing friction that fights against that force is real.

mattnewton · 2026-03-17T18:47:37 1773773257

Seems like a slippery slope. Now the infrastructure is there to ask Apple, Google and Microsoft to confirm identity with selfies over the internet.

ezfe · 2026-03-17T19:14:42 1773774882

That infrastructure is literally already there. It's done and live in some areas.