
Using Coral CDN:

  wget -rl 0 -np -A .mp3 http://thenine.ca.nyud.net/essential/


No good, check out the robots.txt.


Since Coral considers itself a distributed caching proxy rather than a crawler, it doesn't respect robots.txt. It does respect the relevant cache-control headers, like "no-cache", but this site doesn't appear to set them.

However, the main problem here is that Coral doesn't cache any files larger than 50 MB, so almost none of these are cached. The few under 50 MB do seem to be, though, e.g.: http://thenine.ca.nyud.net/essential/1998/1998.01.01%20-%20E...
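
To make those two rules concrete, here is a rough sketch that reads a response's HTTP headers on stdin and guesses whether Coral would cache the file. The function name `coral_would_cache` is made up for illustration, and reading "50mb" as 50 MiB is an assumption; this only mirrors the behaviour described above, not Coral's actual implementation.

```shell
#!/bin/sh
# Hypothetical helper: reads HTTP response headers on stdin and prints
# "cacheable" or a "skip: ..." reason, based only on the two rules
# described in the comment above (no-cache header, 50 MB size limit).
LIMIT_BYTES=$((50 * 1024 * 1024))  # assumption: "50mb" means 50 MiB

coral_would_cache() {
    # Strip carriage returns and lowercase everything so that header
    # matching is case-insensitive.
    headers=$(tr -d '\r' | tr '[:upper:]' '[:lower:]')

    # Rule 1: the origin asked caches not to cache the response.
    if printf '%s\n' "$headers" | grep -q '^cache-control:.*no-cache'; then
        echo "skip: no-cache"
        return 0
    fi

    # Rule 2: the file is over the size limit (only checked when a
    # Content-Length header is present).
    length=$(printf '%s\n' "$headers" \
        | sed -n 's/^content-length:[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
        | head -n 1)
    if [ -n "$length" ] && [ "$length" -gt "$LIMIT_BYTES" ]; then
        echo "skip: larger than 50 MB"
        return 0
    fi

    echo "cacheable"
}
```

You would feed it the headers of the origin file, for example with `curl -sI <url> | coral_would_cache`.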


Yup, thenine.ca is blocking Coral's cache for .mp3 files :(


-e robots=off
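
Presumably that flag is meant to be added to the wget command from the top of the thread. A sketch of how the combined call might look, wrapped in a function so the flags can be annotated; the name `mirror_mp3s` and the `DRY_RUN` switch are illustrative and not part of wget.

```shell
#!/bin/sh
# Hypothetical wrapper around the wget call from the first comment.
mirror_mp3s() {
    url=$1
    # -e robots=off : execute the wgetrc command "robots=off", so wget
    #                 ignores robots.txt during recursive retrieval
    # -r -l 0       : recurse with no depth limit
    # -np           : never ascend to the parent directory
    # -A .mp3       : accept only files ending in .mp3
    set -- wget -e robots=off -r -l 0 -np -A .mp3 "$url"

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"   # print the command instead of running it
    else
        "$@"
    fi
}
```

Setting `DRY_RUN=1` before calling `mirror_mp3s http://thenine.ca.nyud.net/essential/` prints the command without downloading anything.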

However, aren't they copyrighted?!



