You can sell the old, less efficient GPUs to folks who will be running them with markedly lower duty cycles (so, less emphasis on direct operational costs), e.g. for on-prem inference or even just typical workstation/consumer use. It ends up being a win-win trade.
Doubling your capacity by building a new data center and securing power for it takes years. Swapping out a rack for one that is twice as fast takes very little time in comparison.
Depends on the rate of growth of the hardware. If your data center is full and fully booked, and hardware is doubling in speed every year, it's cheaper to switch it out every couple of years.
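A back-of-the-envelope sketch of that arithmetic, for what it's worth. It assumes a clean doubling in speed per doubling period (one year by default, which is an illustrative figure, not a measured one) and a data center whose rack count is fixed. The function names are made up for this example, and it ignores purchase price, resale value, and power costs entirely.

```python
def relative_speed(years: float, doubling_period_years: float = 1.0) -> float:
    """Speed of new hardware relative to hardware bought `years` earlier.

    Assumes speed doubles once per `doubling_period_years` (an assumption).
    """
    return 2.0 ** (years / doubling_period_years)


def extra_sites_equivalent(swap_interval_years: float,
                           doubling_period_years: float = 1.0) -> float:
    """How many additional same-size data centers full of the OLD hardware
    you would need to match the capacity gained by one in-place swap."""
    return relative_speed(swap_interval_years, doubling_period_years) - 1.0


if __name__ == "__main__":
    for interval in (1, 2, 3):
        print(
            f"swap every {interval} yr: "
            f"{relative_speed(interval):.1f}x capacity in the same floor space, "
            f"equivalent to {extra_sites_equivalent(interval):.1f} extra sites"
        )
```

Under those assumptions, swapping every two years quadruples capacity in the same building, which is what you would otherwise get from three more sites of the old hardware. Slow the doubling period down and the case for swapping weakens accordingly.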