> At the end of day they only increased performance by 50%.
> only 50%.
I'm sorry... what?! That's a lot of improvement and will save you a lot of money. Even 10% improvements are quite large!
Think about it this way: if you have a task that takes an hour and you turn that into 59 minutes and 59 seconds, it might seem like nothing (about 0.03%). But now consider you have a million users; that's a million seconds, or 277 hours! This can save you money, since you are often paying by the hour in one way or another (even if you own the system, your energy has a cost that varies with load). If the task runs frequently, you're saving a lot of time in aggregate, even if it's not much per person. And even for a single person, this adds up if more devs do it. Death by a thousand cuts.
But in this specific case, if a task takes an hour and you save 50%, it now takes 30 minutes. Maybe the task here took only a few minutes, but people will be chaining these together quite a lot.
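The aggregate math above can be sketched like this (the numbers are the illustrative ones from the comment, not measurements from the article):

```python
def aggregate_savings_hours(per_user_seconds_saved: float, users: int) -> float:
    """Total hours saved across all users for one run of the task."""
    return per_user_seconds_saved * users / 3600

# Saving 1 second on a 1-hour task looks negligible (1/3600, about 0.03%)...
per_user = 1.0
# ...but across a million users it adds up to hundreds of hours per run:
total = aggregate_savings_hours(per_user, 1_000_000)
print(int(total))  # → 277 hours
```

Multiply that by how often the task runs and by your effective hourly cost, and "nothing" turns into a real line item.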
You have to ask yourself, 10% of what? I don't usually mind throwing 10% more compute or memory at a problem, but I do mind if it's 10x more. I've shipped 100x perf improvements in the past where 1.5x would have been a waste of engineering time. A more typical case is a 10x or 20x improvement that's worth a few days of coding. Now, if I'm working on a mature system that's had tens of thousands of engineering hours devoted to it, and is used by thousands of users, then I might be quite happy with 10%. Though I also may not! The broader context matters.
Sure, but I didn't shy away from the fact that it's case dependent. In fact, you're just talking about the meta-optimization, which needs to be considered for any optimization anyway.
Maybe these optimizations benefit the two users who do the operation three times a year.
In such an extreme case no amount of optimization work would be profitable.
So the parent comment asks a very valid question: how much total time was saved by this and who asked for it to be saved (paying or free tier customers for example)?
People who see the business side of things rightfully fear when they hear the word "optimization": it's often not the best use of limited development resources, especially in an early-stage product still under development.
I do wish that when people write about optimization that they would then multiply by usage, or something similar.
Another way is to show CPU usage across a fleet of servers before and after, then reshuffle the load onto fewer machines and use the number of servers no longer needed as the metric.
The number of servers has direct costs as well as indirect ones, so you can even derive a dollar value. Even more so if you factor in a growth rate.
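As a sketch of that server-count metric (all counts and prices here are made up for illustration; only the 50% figure comes from the thread):

```python
import math

# Hypothetical fleet before the optimization.
servers_before = 40
cpu_reduction = 0.50           # the 50% improvement discussed above
cost_per_server_month = 200.0  # hypothetical all-in monthly cost per server

# After reshuffling load onto fewer machines (round up: you can't run
# a fractional server):
servers_after = math.ceil(servers_before * (1 - cpu_reduction))
freed = servers_before - servers_after
monthly_savings = freed * cost_per_server_month
print(freed, monthly_savings)  # → 20 4000.0
```

The freed-server count is the easy-to-communicate metric; multiplying by cost turns it into the dollar figure business people actually want.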
> I do wish that when people write about optimization that they would then multiply by usage, or something similar.
How? You can give specific examples and then people make the same complaints because they aren't relevant to their use case. It's fairly easy to extrapolate the numbers to specific cases. We are humans, and we can fucking generalize. I'll agree there isn't much to the article, but I find this ask a bit odd. Don't you have all the information to make that calculation yourself? They should have done it if they were addressing their manager, but this looks like a technical blog, where I think it's fair to assume the reader is technical and can make these extrapolations themselves.
I hear you. However, I have one rule of writing: assume the reader is lazy. It is not that they are, but the assumption goes a long way in making content digestible.
Also, I think knowing the combined effect is super interesting. For example, micro-benchmarks are fun to use and see improvements, but I also want to know the effect on the whole program.
I do wonder, though, if assuming the reader is lazy is the best approach, especially in technical posts. There's a difficulty in balancing forcing the person to digest what you say against making it approachable (especially when you consider a noisy audience). It's a natural filter; whether that's good or bad, I guess, depends.
Agreed about the microbenchmarks and scale. Things don't always scale as expected. But there are a lot of variables here, so it might be difficult to portray an accurate expected result. Though I can see this being worthwhile for anyone wanting to build RAG pipelines or process lots of text into embeddings. It also looks like the project is still under active development and started 6 months ago (single dev?), so I'm not sure we should expect to see too big a scale yet: https://github.com/bosun-ai/swiftide
So idk, that seems like exactly the kind of thing HN should be well suited for: new projects where people are hacking together useful frameworks. But idk, I guess if YC is funding companies whose business model is to fork an OSS project, then the bar might be lower than I think. But I thought we were supposed to be hackers (not necessarily crackers) ¯\_(ツ)_/¯
> So the parent comment asks a very valid question: how much total time was saved by this and who asked for it to be saved (paying or free tier customers for example)?
That is a hard question to answer because it very much depends on the use case, which is why I gave a vague response in my comment. Truth be told, __there is no answer__ BECAUSE it depends on context. In the case of AI agents, yeah, 50% is going to save you a ton of money. If you make LLM calls once a day, then no, probably not. Part of being the developer is determining this tradeoff. Specifically, that's what technical managers are for: communicating technical stuff to business people (sure, your technical manager might not be technical, but someone being bad at their job doesn't make the point irrelevant; it just means someone else needs to do the job).
You're right about early-stage products, but there are lots of moderate and large businesses (and yes, startups) that don't optimize but should. Most software never gets optimized, and that has led to a lot of enshittification. Yes, move fast and break things, but go back and clean up, optimize, and reduce your tech debt, because you left a mess of broken stuff in your wake. It's weird to pigeonhole this to early-stage startups.