Interesting. If you have some pointers for something to read that discusses why special relativity requires locality, I'd love to read it. I have no real idea where to start searching.
Edit: Just to be clear, I'm aware that Bell's theorem says that QM must either break locality or realism, but I don't really understand why it can't break locality. While incredibly inconvenient, wouldn't that solve the problem? Again, I realise I'm naive, so I don't actually suppose my line of reasoning is correct ;-)
Special relativity is essentially an explanation of why the speed of light is a constant regardless of how you measure it. That is, if you're in a train moving at half the speed of light relative to the ground, and someone fires a laser in the same direction as the train from the last station you passed through, that laser beam will still move towards you at the speed of light as you measure it. If you fire a laser back, it will reach the station at the same time as the laser from the station reaches you (as seen by an observer in the station).
This makes no sense unless the speed of light is a fundamental physical constant, so that motion in general depends on the speed of light, which is what special relativity postulates.
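For anyone who wants to check the train example numerically, here's a minimal sketch using the standard textbook velocity-addition formula. The function name `add_velocities` and the variable names are just mine for illustration, and it only handles motion along a single line.

```python
C = 299_792_458.0  # speed of light in m/s


def add_velocities(u, v, c=C):
    """Relativistic composition of two collinear velocities u and v."""
    return (u + v) / (1.0 + (u * v) / (c * c))


# Ground frame: the laser moves at c, the train at 0.5c in the same direction.
# To get the laser's speed in the train frame, compose c with -0.5c.
laser_in_train_frame = add_velocities(C, -0.5 * C)

# For contrast: two sub-light speeds never add up to more than c.
half_plus_half = add_velocities(0.5 * C, 0.5 * C)

print(laser_in_train_frame / C)  # laser speed in the train frame, in units of c
print(half_plus_half / C)        # 0.5c composed with 0.5c, in units of c
```

Naive addition would give 0.5c for the first case and 1.0c for the second; the formula instead keeps light at c and everything else below it.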
Now there are ways to have a special kind of non-locality that do not violate special relativity - you can have phenomena that happen at infinite speed, but only if they do not carry mass or energy or any information at all. The common interpretation of wave-function collapse is an example of such a phenomenon.
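A rough way to see the "carries no information" part is to look at the textbook probabilities for a spin singlet pair. This is my own illustrative sketch, not something specific to any interpretation: the function names are made up, and the joint probability formula is the standard one for analyser angles `a` (Alice) and `b` (Bob).

```python
import math


def joint_probability(s, t, a, b):
    """P(Alice gets s, Bob gets t) for a spin singlet.

    s and t are +1 or -1; a and b are the analyser angles in radians.
    """
    return (1.0 - s * t * math.cos(a - b)) / 4.0


def alice_marginal(s, a, b):
    """Probability that Alice gets s, summed over Bob's outcomes."""
    return sum(joint_probability(s, t, a, b) for t in (+1, -1))


def correlation(a, b):
    """Expected value of the product of the two outcomes."""
    return sum(
        s * t * joint_probability(s, t, a, b)
        for s in (+1, -1)
        for t in (+1, -1)
    )


# Bob tries several analyser angles; Alice's own statistics never change.
for b in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(b, alice_marginal(+1, 0.0, b), correlation(0.0, b))
```

The correlation depends strongly on Bob's angle, but Alice's marginal stays at one half whatever he does, so she can't read a message out of her results alone.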
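I'd also note that the famous E=mc^2 is tied to the speed limit too: kinetic energy (mv^2/2 at low speeds) is part of the total energy of an object, and in the full relativistic formula that energy grows without bound as the speed approaches c.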
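To put numbers on the energy argument, here's a small sketch comparing the Newtonian kinetic energy with the relativistic one, (gamma - 1)mc^2, where gamma is the usual Lorentz factor. The function names are my own, and the 1 kg mass is just an arbitrary example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v, c=C):
    """gamma = 1 / sqrt(1 - v^2/c^2), valid for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def relativistic_kinetic_energy(m, v, c=C):
    """Kinetic energy (gamma - 1) * m * c^2 in joules."""
    return (lorentz_factor(v, c) - 1.0) * m * c * c


def newtonian_kinetic_energy(m, v):
    """Low-speed approximation m * v^2 / 2 in joules."""
    return 0.5 * m * v * v


mass = 1.0  # kg, arbitrary example value
for fraction in (0.01, 0.5, 0.9, 0.99, 0.999):
    v = fraction * C
    ratio = relativistic_kinetic_energy(mass, v) / newtonian_kinetic_energy(mass, v)
    print(fraction, ratio)
```

At 1% of c the two formulas agree almost exactly; close to c the relativistic energy runs away from the Newtonian one, which is why no finite amount of energy gets a massive object to c.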
There are interpretations of quantum mechanics that give up on locality, most notably the Pilot Wave Theory[1]. It does work, and it is compatible with relativity.
I think that may be the reason it's not very popular: ok, so we've got these faster-than-light pilot waves, but we can't actually use them to do anything faster than light. They're just there for bookkeeping. (That said, Many Worlds suffers from the same problem, but it's very popular. They're two different ways of slicing up the same equation. You pick whichever one suits you.)
Physics tries to fit equations to reality; it is not reality itself. We don't know what an atom "is", we just know how it behaves with high precision.
If the simplest and most consistent math is a non-physical pilot wave, I don't think this really matters if it lets you calculate something more easily or correctly. I don't personally know how to use them (my five QM courses used traditional techniques) but if they give useful results it hardly matters if they're "real".
My good friend did his undergraduate thesis by noticing that Clebsch–Gordan coefficients could be used to describe grain boundary orientations in polycrystalline materials. Doesn't mean grain boundaries have spin. It's just math that was convenient and worked well.
There's a lot to be said for shutting up and calculating. If I were a physicist, I might subscribe to that myself. Since I can't calculate myself, I try to remain agnostic even to that extent.
That said, physics advances do sometimes come from asking "What if X is real?" The positron and electron spin are both poster children for that. Instead of just shutting up and calculating, people focused on the part of the calculation that seemed to imply the existence of an unobserved thing. We could, in fact, have kept going with a physics in which positrons were merely calculation conveniences; that physics is valid. But we might not have discovered the Standard Model that way.
So I'm of two minds... and in a lot of ways, I'm not really entitled to be of any minds, since my formal education stopped at undergrad, and I'm no longer capable of doing even that much math. I get leery when people with even less education want to "understand" without doing any of the math, because I fear that the best of explanations will only mislead them.
I'm not sure I understand why you see this as a dichotomy. Sometimes inspiration comes from a weird idea, sometimes it falls out of mathematical analysis.
It's not like the two are mutually exclusive; everyone thinks a bit differently, thankfully. Your example of the positron and electron seems fine: math and experiment in a cycle of discovery. You wouldn't know to look for a positron if you didn't study the electron experimentally and try to come up with some math for it.
Contradictions are inconsistencies in the theory, i.e. the theory can give different results depending on how you compute. To avoid this you need to apply abstract reasoning from outside the theory to decide how to compute in every situation. This means the theory doesn't work by itself, i.e. it's not an objective theory. Also, by realism Bell means hidden variables, not realism at large.