Hacker News

Regarding #3, we still have so far to go to get to human-level intelligence that I think companies benefit more from open research than closing their doors. When a company publishes their research, others can find and publish improvements to their research. Then, the company can use the published improvements to improve their product.


Have you ever tried to reproduce research from academics or companies? Some papers take more than a day of work to reproduce. On top of that, if the research is published without the dataset, reproducing it is practically impossible or extremely expensive. Doing GPT-2 from scratch is around $50k of compute time and data collection.

Even a simple deep learning paper will require you to have at least an NVIDIA 1080 Ti if you are lucky, and for NLP I needed to buy an RTX Titan (a $2500 graphics card).


> Some papers take more than a day of work to reproduce.

A large part of my job is this. Generally reproducing a paper with no code is months of work.

> Doing GPT-2 from scratch is around $50k of compute time and data collection.

It's extraordinarily rare that you need to do this, though. I know a few people who have (mostly for foreign languages), and all of them were able to get TPU grants from Google or access to multi-node GPU clusters (which are pretty easy to find if you work in the field: plenty of vendors want someone to test their "supercomputer" for a week).

> Even a simple deep learning paper will require you to have at least an NVIDIA 1080 Ti if you are lucky, and for NLP I needed to buy an RTX Titan (a $2500 graphics card).

Most of my work is in NLP and I do a large amount of it on a 1070.


Yeah, GPT-2 requires a bit more RAM than the 10- or 20-series Ti cards have.
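For a rough sense of scale, here is a back-of-the-envelope sketch of the memory involved. It is only an estimate under stated assumptions: the parameter counts are the approximate published GPT-2 sizes, training is assumed to be plain fp32 with Adam (16 bytes per parameter), card memory is taken at its nominal figure, and activations and framework overhead are ignored, so real usage will be higher.

```python
# Rough GPU memory estimate for GPT-2. Assumptions: fp32 everywhere,
# Adam optimizer, activations and framework overhead ignored.
GIB = 1024 ** 3

def weights_gib(n_params, bytes_per_param=4):
    """Memory for the weights alone (fp32 by default)."""
    return n_params * bytes_per_param / GIB

def adam_training_gib(n_params):
    """fp32 weights + gradients + two Adam moment buffers = 16 bytes/param."""
    return n_params * 16 / GIB

# Approximate published parameter counts for the GPT-2 family.
GPT2_PARAMS = {"small": 124e6, "medium": 355e6, "large": 774e6, "xl": 1.558e9}

# Nominal memory of the cards mentioned in this thread.
CARD_GIB = {"1080 Ti": 11, "2080 Ti": 11, "Titan RTX": 24}

for name, n_params in GPT2_PARAMS.items():
    need = adam_training_gib(n_params)
    fits = [card for card, mem in CARD_GIB.items() if mem > need]
    print(f"{name}: weights {weights_gib(n_params):.1f} GiB, "
          f"training state {need:.1f} GiB, fits on: {fits or 'none listed'}")
```

Under these assumptions, the weights of even the largest model fit on an 11 GB card for inference, but the training state of the larger variants does not, before counting activations.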

I think my point was unclear: the issue is reproducing the paper from scratch (I guess the model is enough?). I'm not sure how those papers are peer reviewed unless they also send the dataset to the reviewers.

I was also unclear on another thing: when I said "reproducing", I meant just getting a model that has already been pretrained. I agree with your points.


Peer review almost never means reproducing the model, but that isn't because of the dataset (which is usually available to the reviewer) but because that's not what a peer review is!

A peer review isn't an adversarial process where you think that the person has done something wrong. Instead, it's extra eyes on it to make sure they have thought of everything, and to say what additional tests might be needed and why.


This is a very interesting point! If this is true then it would mean that independent researchers will have a very tough time producing quality research without sufficient funding.



