Except that, for a biochemist, learning Python to solve problems he/she has in protein folding research (or whatever) is a useful skill for getting things done in his/her field. A programmer learning about basic lab equipment probably isn't going to help much in developing software (although it might help him/her get more attractive dates).
And protein folding is much more useful to humanity than compiler optimization. Does anyone even work on compiler optimization anymore?
If it wasn't for the programmer creating languages that make massive, fast computation accessible to the biochemist, there would be no Python for the biochemist to use to do their research. The programmer is just once removed from the "important to humanity" aspects.
And if it wasn't for the programmers who get excited about languages and pioneer different uses of them, the biochemist would most likely never even know there is this tool available for them to use.
Then we are in the same club. I've published research on global optimization algorithms for protein folding. I'm not a biochemist, but I understood that protein folding actually has immediate application in understanding disease and drug design.
As a computer scientist, I also understood that compiler optimization is a mature field with most of the low-hanging fruit already picked. So, I guess I'm confused and will ask respectfully what problems in compiler optimization make it a thousand times more useful than protein folding and associated medical problems?
"I'm not a biochemist, but I understood that protein folding actually has immediate application in understanding disease and drug design....what problems in compiler optimization make it a thousand times more useful than protein folding and associated medical problems?"
Short answer:
Compilers are used for real work, every day. Nobody is using protein structure prediction for anything practical, and they likely won't be for decades more. At this point, it's blue-sky research.
Long answer:
"Immediate application" is one of those bits of academic-speak that really means "is related to", but sounds better to grant review boards. While it's true that protein folding is important (after all, most biological processes are mediated by folded proteins), it's not true that protein structure prediction is important. It would be great if we could predict protein structures accurately, but we can't, and until we can, it's not a practically useful discipline.
Even the very best, crystallographically determined protein structures are barely sufficient to do rational drug design, and predicted structures don't come close to that level of quality. For example: we can sometimes (very rarely) predict very small (<150 residue) protein structures to within 1 angstrom RMSD of their experimentally determined shapes (i.e. >2 angstrom resolution, in the best case). However, the interactions important to drug binding, protein design, etc., don't start until a tenth of that (scales of ~0.1 angstrom).
Throw in the fact that the vast majority of proteins are much larger than 150 residues, and that we keep creating cheaper, faster, more automated ways of getting actual experimental information on structure, and the role of protein structure prediction looks increasingly marginalized. It's definitely a cool, fun problem -- just not a very practical one.
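For anyone unfamiliar with the RMSD figure quoted above: it is the root-mean-square distance between matching atoms in two structures. Here is a minimal Python sketch of that calculation. The function name `rmsd` and the coordinates are made up for illustration, and it assumes the two structures are already superimposed (real comparisons normally do an optimal alignment first, which this does not attempt).

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: both structures are already superimposed and list the same
    atoms in the same order. No alignment step is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between the matching pair of atoms.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in angstroms, not real protein data.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
experimental = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]

print(rmsd(predicted, experimental))  # 1.0
```

In this toy case every atom is off by exactly 1 angstrom, so the RMSD is 1 angstrom, which is the best-case prediction accuracy mentioned above and still about ten times coarser than the ~0.1 angstrom scale where the interesting interactions happen.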
For whatever it's worth, my first papers were on applying the state-of-the-art method (you've heard of it...I think you're paraphrasing the lab's PR) for protein structure prediction to genome annotation. To call the approach useful was/is a stretch, and that's for a much easier application than drug design (in fact, we were trying to find a practical application for protein structure prediction, and it was the most likely thing we could think of!)
Thanks for the very detailed answer. Amazingly, I can follow the gist of it after over 10 years.
But, you still don't answer the question. What problems in compiler optimization are more important than problems in protein folding? You seem to indicate that protein folding is a basic science problem and not a "practically useful discipline". In fact, your statement "It would be great if we could predict protein structure accurately, but we can't, and until we can, it's not a practically useful discipline" says that because the problem isn't solved, it's not important, but it will be important when it's solved. So, trying to solve the problem is important, no?
But, you don't say anything about compiler research, and specifically compiler optimization research and development, which you claimed is much more important. What specific areas in compiler optimization (or just in compiler design) are more important than protein structure prediction and modeling?
I did answer your question, but now you're asking a different one. I have no idea how "important" protein structure prediction will ultimately become; I just know that it's currently pretty useless, and getting worse.
My first comment was that compiler optimization is about a thousand times more useful than protein folding. I stand by that remark. However, at the beginning of my long answer, I mistakenly wrote that protein structure prediction is not important, when I had meant to write that it is not useful: protein folding is an important biological process, but protein structure prediction is not particularly useful, for the reasons I've mentioned.
It's not my place to say which area of research is more important. That's a subjective question, and the answer depends on your value system, your outlook, and your willingness to wait. Obviously, I think that compiler optimization is more useful, because compilers are actually in use today. In 100 years...who knows?
That said, I think you're laboring under the assumption that compiler optimization is a "mature" field, and that it is "solved" (and therefore less important), whereas protein folding is not "solved" (and therefore more important). The thing is, people have been doing protein folding research for at least fifty years -- it is a very mature field, and the low-hanging fruit has been picked. I think that a new researcher is equally likely to make significant gains in either field, but that the potential for practical impact is still much greater in compiler design.
I'm asking the same question, and you're not close to answering it. My question was "what problems in compiler optimization make it a thousand times more useful than protein folding and associated medical problems?" because you authoritatively stated that protein folding work (implicitly computational protein folding work) was much less important/useful (pick one) than compiler optimization work. You keep talking about protein folding research, but you don't say anything about compiler optimization. Was that comparison just an off-hand or self-deprecating remark about protein folding work? I'm not asking which area is more important. And, I know enough about both fields to not get lost in the technical details of your answers. So, I'll ask it again: what problems in compiler optimization make it a thousand times more useful than protein folding and associated medical problems?
I didn't mean for people to focus on "protein folding"; that was just an example of something that scientists do that's computationally intensive.
I find myself in my early 30s knowing an awful lot about database applications, but with zero domain knowledge. I can go into any field and implement a spec, but I don't understand any of it. Someone wants a graph of this data in their application, I'll give them a great graph, but I look at it and it's just squiggly lines to me. I just feel like I'm missing something.
Wow. This is really enlightening; I thought that the computational method was going to open a new phase in disease treatment, but you seem to say here that the empirical method is on its way to making it useless. So the Pande group at Stanford is wasting their time. Interesting.
"So the Pande group at Stanford is wasting their time."
I wouldn't go quite that far. The research is definitely speculative, but lots of interesting things can come from speculative research. My point is that you don't do research into protein structure prediction with the intent of finding anything useful. It's basic science.
We can (and occasionally do) learn things from computer models of proteins. But the PR in this field has been seriously exaggerating the results of a few of the more prominent researchers. We're a long way from curing diseases or designing drugs with this stuff.
Is it the weakness of the modeling or the lack of computational horsepower that limits the research in this area? And would you mind linking to your papers?
That's a matter of debate. Some people think that the problem is search limited, others think that the current models are bad. In my opinion, the bulk of the evidence supports the latter conclusion.
Backchannel me, and I'll be happy to provide you with references to the papers I wrote/helped write. Most of them aren't open access, unfortunately.
That would be good advice to your brother. Programming as a career seems like a dead end going forward, for whatever reasons. However, if he is in some other seemingly unrelated occupation (law, medicine, education, etc.), then programming skill could be a competitive advantage.
If you can bring a pretty woman with you, it works even better. My wife and I visited the MIT campus a few years back, and she attracted more attention to us than I would have alone. She knows next to nothing about technology, but it didn't matter. Note, I'm not trying to knock MIT girls, just giving you a tip to get into some labs. Also, MIT folks are very nice and open, not elitist at all, in my experience.
When I read "fundamentally hard", I think NP-hard. What is NP-hard about Twitter? Am I missing something or is this a reference to a systems development issue?
Realtime and near-realtime is hard in practical terms, not mathematical terms. I work on the latter, and some days I miss working on the former.
If it were easy to get it right, and profitable to execute correctly, I have to imagine that a competitor would have done so by now. But given the time and resource constraints, it seems that the problem remains difficult to manage, and I've always enjoyed that sort of challenge.
Developing heuristics for increasing statistical power, robustness, or informativeness is a mostly cerebral pursuit. Operations is more bloody in-the-trenches mud wrestling. And sometimes I look out from the ivory tower and I miss it.
That's hard to say. One can make the argument for a price crash, as speculators have flooded into the oil and commodities markets and bid up prices. There is a lot of fear of peak <insert finite natural resource>. Once people get tired of it, prices will drop.
On the other hand, oil is a finite natural resource that has been pumped out of the ground for decades at an increasing rate. Only severely retarded politicians don't believe in peak oil or believe that oil is an infinite resource (they also believe in ghosts, so there). Thanks to China, India, and other countries deciding they want some basic level of civilization, the rate of increase in the rate of increase of demand means oil supplies and proven reserves are going down fast. That means there might not be a price crash and in a few years we will look at $6/gallon for gasoline as the good ole days.
Everything is finite. It all depends on /how/ finite things are. Technological advances, increases in efficiency, and the influence of competing technologies (electric, hydrogen, liquid propane gas, and so forth) will all have a role to play.
It is typically very hard to accept that these things can have a big effect, but that doesn't mean they won't (nor, of course, that they must, but the odds are better that they will). The price of oil will be as "immaterial" within 50 years as the prices of horses or gas lamps were immaterial in 1940 compared to 1890.
Java is going, going, ...