
That’s a good idea, but Google sometimes crawls without the Googlebot user agent, so it’s not going to be 100 percent foolproof.

You’d be better off just blocking all of the IP addresses that Google crawls from. There are published lists of those out there.

When I used to cloak website content and only serve certain content to Google, the only reliable way was IP cloaking, because Google also crawls via “partners”, such as Comcast IPs.

So if you want to really get your site out of the index, serve the page with a noindex tag, or a noindex directive in the response header, based on Google’s IP addresses.
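A minimal sketch of that IP-based approach. The CIDR range below is one of Googlebot’s published ranges, but treat the hardcoded list as an assumption for illustration; in practice you’d refresh it from Google’s published JSON list of crawler ranges.

```python
import ipaddress

# Illustrative only: a sample Googlebot range. The full, current list is
# published by Google and should be fetched/refreshed periodically.
GOOGLEBOT_RANGES = [ipaddress.ip_network("66.249.64.0/19")]

def is_googlebot_ip(addr: str) -> bool:
    """Check whether a client address falls in a known Googlebot range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in GOOGLEBOT_RANGES)

def response_headers(client_ip: str) -> dict:
    """Build response headers, adding noindex only for Google's crawlers."""
    headers = {"Content-Type": "text/html"}
    if is_googlebot_ip(client_ip):
        # X-Robots-Tag: noindex tells Google not to index the page,
        # without changing what other visitors see.
        headers["X-Robots-Tag"] = "noindex"
    return headers
```

The same check can be done in the web server itself (e.g. an nginx `geo` block) so the application never sees crawler traffic.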



Hey! Googler here!

We don't use our hardware located on partner networks to do indexing. Those machines are searching for malware and serving some YouTube videos and Play Store downloads.


You forgot to add the word "currently".


"Because google crawls using "partners", such as using Comcast IPs."

Is this different from when others use proxies to evade access controls?



