Hacker News

Yes! We need to see more material under Matt Might's CRAPL license. I feel a strong urge to write off any result that doesn't have accompanying code with it.

http://matt.might.net/articles/crapl/



Great idea, awful awful name - CRAP license, as in 'Community Research/Academic Programming' = CRAP. Why would you make your concept the butt of jokes from the outset?


It's right there on the page: "It's not the kind of code one is proud of." By releasing it as CRAP, you're acknowledging that the code is imperfect, reducing the perceived barrier. If academics feel like you can only release code if it's pristine, it'll never get released, because for the most part, the incentives aren't there to make clean code in an academic setting.


This badly misses the point. What matters in this context is accuracy, not elegance.

If I brute-force detection of the first 10,000 prime numbers (for some non-CS context) by testing each candidate n for divisibility by every integer from n-1 down to 2, it's horribly inefficient and programmers may laugh, but it's accurate, simple, and easily reproducible. Given that a prime number is defined as divisible only by itself and 1, it may be better to present a brute-force algorithm than to get into a sideline discussion about validating some new-fangled method like the Sieve of Eratosthenes, that upstart.
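A minimal sketch of the brute-force approach described above (Python chosen purely for illustration; the `is_prime` and `first_primes` names are mine, not from any particular research codebase):

```python
def is_prime(n):
    """Deliberately naive primality test: check every candidate divisor
    from n-1 down to 2. O(n) per check, but a direct transcription of
    the definition, which makes it easy to verify by eye."""
    if n < 2:
        return False
    for d in range(n - 1, 1, -1):
        if n % d == 0:
            return False
    return True

def first_primes(count):
    """Collect the first `count` primes by trial of every integer >= 2."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if is_prime(candidate):
            primes.append(candidate)
        candidate += 1
    return primes

print(first_primes(5))  # [2, 3, 5, 7, 11]
```

Nobody would ship this in production, but as "CRAP" goes it is trivially auditable against the definition of primality, which is exactly the point being made.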

Science aims at proof rather than efficiency. What matters is the quality of the result, and its reproducibility. If you can get to the same result in a much more timely fashion that's awesome, but that's an engineering achievement rather than a scientific one. I don't think that anyone wants to put out code labeled as CRAP in an attempt to mollify engineers.


Have you seen typical research code? It's crap. If you looked at it, your first reaction would probably be "Wow, this is crap." It's created in a hurry, usually grows by accretion, and the people who write it are usually pretty half-assed at programming, having made the solidly practical decision to focus more on the other aspects of their field. The commenting is usually lousy-to-nonexistent, the indentation is frequently screwy, and everything about it just screams "I am a temporary hack, written only to get publishable results."

And that's to be expected, because of how most scientific code gets written, and what the incentives are. And releasing this code would still be strictly better than not releasing it. We just need to keep the expectations of code quality low, which the CRAPL license does.


Only programmers have high expectations of code quality. If I'm a biologist, my target audience is other biologists, not programmers.

Your comment reads to me like you want non-programmers to ritually humiliate themselves by labeling their stuff as CRAP before you will deign to parse it for correctness. Patronizing people is not a good way to get them on-side.

> We just need to keep the expectations of code quality low, which the CRAPL license does.

The purpose of the CRAPL is to make code accessible, not to lower expectations. It's not about what you think of the code, it's about whether the code yields correct results.


How to Design Programs is a great text in general, but I think it holds particular power for researchers. I still write some janky code, but particularly when I'm doing R, I find myself falling easily into the HTDP mindset.

Not to dredge up a functional vs imperative battle, but I feel FP has a lower impedance mismatch with mathematical concepts generally.


Acronyms don't have to be positive to be successful. CRUD is an example of this.


CRUD is often used as a pejorative though.


The name is a feature, not a bug.


Perhaps, but is this reflected by widespread adoption of the license? If not, then maybe it is actually a bug.


I wonder how hard it would be to find CRAPL (or otherwise open-sourced) research code that one would be able to contribute to (as a software engineer) in order to clean it up yet preserve function. (You know, add tests, refactor, etc.) I assume that a lot of that would depend on the public availability of test data to exercise the code with, but it's an interesting idea.


That's an interesting question. I'd guess that most code born of academia has a rather short half-life of relevance to its author. However, having any sort of second-pass look at it from a software engineer's perspective would be a major boon.

You might have better luck in a quasi-research public venue. For example, I work at a public agency that uses your tax dollars to forecast travel demand, and then uses the results of that statistical modeling (plus a thick shmear of political wrangling) to decide how to spend more of your tax dollars.

Much of this model code is developed as part of an honest-to-goodness research process (yay!) by the contract software developers that public agencies can afford (not so yay). In other words, things like revision control and unit tests are mostly dismissed as extravagances and unwarranted delays. What validation is performed doesn't exactly inspire confidence.

I'm betting there are a million and one of these kinds of projects. Some of us are trying to work through internal and external issues so we can post this sort of thing on places like GitHub. Others are already there.

If you pick a topic you're interested in, and ask the right people, it'll probably be worth some authorship cred.


I don't know how seriously this license should be taken. I've found at least one typo (in 'tested or verfied'), and a few ambiguous statements, for example: 'You make a good-faith attempt to notify the Author of Your work'.

Does that mean to 'notify the Author about Your work', or 'notify the (Author of Your work)' (i.e. yourself)?

If this license is meant to be serious then I should probably notify its author about these concerns.


Unfortunately the license terms are decidedly non-free and antithetical to the released software being used elsewhere.



