> Yes, in C/C++ heap allocations are proportionately pretty expensive, but I've seen Java programs make just ridiculous amounts of implicit allocations and there really isn't much of a problem.
Java programs make "ridiculous amounts of implicit allocations" because allocations are cheap in Java. And they need to be cheap because Java doesn't have value semantics so it leans hard on escape analysis + cheap allocations.
I agree with the rest of your comment, although I think most of Rust's "cognitive load" amounts to borrow-checker-vs-garbage-collection. You could envision a Rust with explicit allocations and a GC, and that language would have a "cognitive load" approaching that of Go while also being a fair bit more performant insofar as people can much more easily reason about allocations and thus performance.
> Java programs make "ridiculous amounts of implicit allocations" because allocations are cheap in Java. And they need to be cheap because Java doesn't have value semantics so it leans hard on escape analysis + cheap allocations.
Yes, but that's kind of the point, right? Implicit allocation isn't really a problem because a runtime that optimizes the allocations magically for you is a lot easier to build than a runtime that optimizes whether you really need to be copying objects as much as you do.
> Implicit allocation isn't really a problem because a runtime that optimizes the allocations magically for you is a lot easier to build
As far as I know, Java's (default) runtime gives cheap allocations at the cost of long GC pause times.
> than a runtime that optimizes whether you really need to be copying objects as much as you do
It's not "copying", it's "allocating", and avoiding allocations isn't that much work (and frankly I'm surprised this is apparently such a minor problem that no one has bothered to build an IDE plugin that highlights these allocation points automatically--or at least I haven't heard of such a thing). Anyway, "a runtime that minimizes allocations" is just an escape analyzer and Java has one of these too, and IIRC it's a lot more sophisticated than Go's (but it's also a lot harder to reason about as a consequence).
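For what it's worth, Go's compiler will already tell you where these allocation points are: building with `go build -gcflags=-m` prints the escape analysis decisions. A minimal sketch of the distinction it reports (function names are made up for illustration):

```go
package main

import "fmt"

// sumValue keeps everything on the stack: the array is a value and
// nothing escapes, so -gcflags=-m reports no heap allocation here.
func sumValue() int {
	var buf [4]int
	for i := range buf {
		buf[i] = i
	}
	s := 0
	for _, v := range buf {
		s += v
	}
	return s
}

// newSlice forces a heap allocation: the backing array escapes via the
// returned slice, which -gcflags=-m flags as "escapes to heap".
func newSlice() []int {
	buf := make([]int, 4)
	for i := range buf {
		buf[i] = i
	}
	return buf
}

func main() {
	fmt.Println(sumValue(), len(newSlice()))
}
```

An IDE plugin of the kind described would essentially just surface this compiler output inline.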
> As far as I know, Java's (default) runtime gives cheap allocations at the cost of long GC pause times.
"long GC pause times" is kind of vague, so I guess you could be correct, but in practice there's a LOT of different ways the memory management can be handled, many of which are deemed "pauseless GC" (though the term is somewhat misleading).
My statement was considering that reality though. While not true for some use cases, in the vast majority of cases, the runtime optimizes the allocations more than sufficiently.
> It's not "copying", it's "allocating"
Allocators can do a pretty good job of minimizing the overhead of allocation, to the point the amortized cost isn't much more than a single machine instruction. Allocating gigabytes of memory quickly is possible. Copying the data can be a lot more work, and often objects have copy semantics that add even more work.
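The "single machine instruction" figure refers to bump-pointer allocation, which is what moving collectors enable: an allocation is just a bounds check plus an offset increment. A toy sketch (illustrative only; a real allocator also handles alignment, freeing, and refilling from the OS):

```go
package main

import "fmt"

// bump is a minimal bump allocator: allocation is an offset check plus
// an increment, which is why the amortized cost can approach a single
// instruction on the fast path.
type bump struct {
	buf []byte
	off int
}

// alloc hands out the next n bytes of the buffer, or nil when full
// (a real allocator would grab a fresh block from the OS instead).
func (b *bump) alloc(n int) []byte {
	if b.off+n > len(b.buf) {
		return nil
	}
	p := b.buf[b.off : b.off+n]
	b.off += n
	return p
}

func main() {
	b := &bump{buf: make([]byte, 1024)}
	a := b.alloc(16)
	c := b.alloc(32)
	fmt.Println(len(a), len(c), b.off) // 16 32 48
}
```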
> Anyway, "a runtime that minimizes allocations" is just an escape analyzer and Java has one of these too, and IIRC it's a lot more sophisticated than Go's (but it's also a lot harder to reason about as a consequence).
I think you're implicitly saying "a runtime that minimizes heap allocations" there, in which case I'd agree.
> in practice there's a LOT of different ways the memory management can be handled, many of which are deemed "pauseless GC" (though the term is somewhat misleading).
Yes, but I'm pretty sure those "pauseless GC" schemes impose other tradeoffs.
> My statement was considering that reality though. While not true for some use cases, in the vast majority of cases, the runtime optimizes the allocations more than sufficiently.
I'm not sure I follow. The same could be said for Go--in the vast majority of cases, Go's tradeoffs (slow allocations, low latency / non-moving GC) are also suitable.
> Allocators can do a pretty good job of minimizing the overhead of allocation, to the point the amortized cost isn't much more than a single machine instruction.
As far as I know, speeding up allocations to this degree requires a moving GC which imposes a bunch of other constraints (including copying a bunch of memory around).
> Allocating gigabytes of memory quickly is possible. Copying the data can be a lot more work, and often objects have copy semantics that add even more work.
Yes, but the bottleneck here wasn't the copying, it was the allocations. And if you optimized away allocation cost entirely such that only the copy cost remained, that cost would be so small that the OP would never have bothered to profile because copying small objects like this is so cheap compared to everything else (even if it is expensive compared to bump allocating).
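To put numbers on "copying small objects is cheap compared to allocating": in Go, `testing.AllocsPerRun` from the stdlib makes the distinction visible. Passing a small struct by value never touches the allocator, while returning a pointer to it does (a sketch; `point`, `byValue`, and `byPointer` are made-up names):

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y, z float64 }

// sink keeps the pointer alive so escape analysis can't optimize the
// heap allocation away.
var sink *point

// byValue copies the struct; a copy of a small object is just a few
// register/stack moves and never touches the allocator.
func byValue(p point) float64 { return p.x + p.y + p.z }

// byPointer returns a pointer, so the struct escapes to the heap.
func byPointer(x, y, z float64) *point { return &point{x, y, z} }

func main() {
	copies := testing.AllocsPerRun(100, func() { byValue(point{1, 2, 3}) })
	allocs := testing.AllocsPerRun(100, func() { sink = byPointer(1, 2, 3) })
	// by-value path should report zero heap allocations per run,
	// by-pointer at least one
	fmt.Println(copies, allocs)
}
```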
> I think you're implicitly saying "a runtime that minimizes heap allocations" there, in which case I'd agree.
Yes, the allocator and GC are concerned with heap allocations and not stack allocations. I'm using "allocations" as a shorthand for "heap allocations".
In hindsight, I think I chose how to present this poorly, because yes, in this case, the allocation is what is killing the performance. I look at it, and I just see unnecessary implied behaviour creating a performance problem. Usually it isn't the allocations themselves that kill you, but it certainly is the case here.
I agree with you (I think?) that the implicit allocations are a pain point. I think in the Go case, it is the allocations that kill you most of the time (at least that's the case for me), but in C++ it's more likely to be expensive copy constructors or destructors or so on.
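On the Go side, the classic implicit-allocation trap along these lines is string building: each `+=` on a string may allocate a fresh copy, while a preallocated `strings.Builder` allocates once. A sketch of measuring the difference (`concat` and `join` are made-up names):

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sinkStr keeps results alive so the compiler can't optimize the
// allocations away.
var sinkStr string

// concat builds the string naively; every += past the first may
// allocate and copy a new string.
func concat(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// join preallocates once via strings.Builder, then appends in place.
func join(parts []string) string {
	var b strings.Builder
	b.Grow(32)
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	parts := []string{"alpha", "beta", "gamma", "delta"}
	naive := testing.AllocsPerRun(100, func() { sinkStr = concat(parts) })
	pre := testing.AllocsPerRun(100, func() { sinkStr = join(parts) })
	// expect the naive version to allocate repeatedly, the builder once
	fmt.Println(naive, pre)
}
```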