
0. The zeroth rule of SEO is to get your site listed for a search for your site. If your site is bobsfishingtips.com, make sure that if someone searches for Bob's Fishing Tips, you get found. This simply means getting at least one real site to link the name of your site to you, or maybe a couple of sites if you have some common word like Yelp.

1. After that, make sure people link to you with proper anchor text for other keywords. If you want people to search for "fishing tips" and find you, then several people will need to link to you with something like this:

    This site has great <a href="http://www.mysite.com">fishing tips</a>.

2. The more authority these links have, the better you will do. If you get a very high PageRank site to link "fishing tips" to you, you might be immediately first for that query in Google. Or, 2-3 medium links might do it.

3. The words you want people to search for need to be used several times on your site. You should have the words "fishing tips" on several pages and you should link to your best page on "fishing tips" by putting that phrase in your own links to your own pages.

4. Also make sure you put the keywords you want searches for in the title of your page, and enclose them in H1 tags or other bold/header tags. This won't help very much, but it's probably worth doing.

5. Links don't really help you unless they are from a real domain - a link from someblog.blogspot.com will not help your PageRank much at all. Also, the domain needs to exist for a while to help you - something like a year.

6. Good places to get links are from your college and high school newspapers, local newspapers, and anyone else who has a website that would appropriately cover you.

7. Some sites have way more PageRank than you might expect. www.cmu.edu is an anchor site for the link graph, and a link from this site will do wonders for your PageRank.

8. Here are some excellent pages on SEO:

    http://www.localseoguide.com/yelp-seo-analysis-part-one/

    http://www.localseoguide.com/yelp-seo-analysis-part-two/

You should also check out Mahalo.com. That site is SEOed within an inch of its existence on Google, so take some tips from them but tread carefully.

9. In the end, it really boils down to having authoritative links with the right anchor text linking to you. The rest matters very little.
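
For anyone who wants to script the on-page checks from points 3 and 4, here is a minimal Python sketch using only the standard library. It reports whether a keyword appears in the page title, in an H1, and in the anchor text of any link. The sample page and the names `OnPageAudit` and `keyword_report` are made up for illustration, and the parser is deliberately naive (it does not handle tracked tags nested inside each other).

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collects the text inside <title>, <h1>, and <a> elements."""

    TRACKED = {"title", "h1", "a"}

    def __init__(self):
        super().__init__()
        self.found = {"title": [], "h1": [], "a": []}
        self._open = None   # tracked tag currently being read, if any
        self._text = []     # text chunks seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self._open = tag
            self._text = []

    def handle_data(self, data):
        if self._open:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open:
            self.found[tag].append("".join(self._text).strip())
            self._open = None


def keyword_report(html, keyword):
    """Return {tag: True/False} for whether the keyword appears in each tag type."""
    audit = OnPageAudit()
    audit.feed(html)
    kw = keyword.lower()
    return {tag: any(kw in text.lower() for text in texts)
            for tag, texts in audit.found.items()}


# Hypothetical page for illustration only.
page = """<html><head><title>Bob's Fishing Tips</title></head>
<body><h1>Fishing tips for beginners</h1>
<p>See our <a href="/best">best fishing tips</a>.</p></body></html>"""

report = keyword_report(page, "fishing tips")
```

As point 9 says, this is the minor part of the job; the same anchor-text check is more interesting when run against the pages that link to you.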



All good advice. I'd add:

1 Speed matters. Google is on record about this. They want your site to load quickly. This factor is part of the 'caffeine' update that Google is in the process of rolling out http://searchengineland.com/site-speed-googles-next-ranking-...

2 Don't stop with H1s. Think about the structure of your page. Subheaders should be H2, and so on.

3 Use Google Webmaster Tools to submit a sitemap. They'll regularly pull it down from your server so as you add pages to your site, Google will know to index them. They'll tell you how many pages you have submitted, and how many are indexed.

While you're in Webmaster Tools, check out if your site has crawl errors. Check out the HTML suggestions. Google will point out the pages which are missing title tags, which have duplicate tags, short descriptions, etc. Lots of other information in there, including page load times.
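
If you want to try the sitemap route, a basic sitemap is just an XML list of URLs. Here is a rough Python sketch that builds one with the standard library. The namespace is the one from the sitemaps.org protocol; the page paths are invented for the example, and `build_sitemap` is just a name I picked.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on the example domain from the thread.
sitemap_xml = build_sitemap([
    "http://www.bobsfishingtips.com/",
    "http://www.bobsfishingtips.com/fishing-tips",
])
```

You would save the result as something like sitemap.xml on your server and submit its URL in Webmaster Tools.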


You don't need a site index unless you have a strange site. If you have text/HTML and you are linked from other sites, the site map won't matter.


A site map is an excellent way to make sure that it is your content that makes it into Google rather than that of someone who 'clones' you.

It can give you a few days lead over the cloners and that's sometimes all you need.


I call bullshit. Have you actually ever had someone clone your site or tried to protect against it this way? I am all ears for your story... I have never heard of anything like that.

Setting up a sitemap is usually just a placebo for finicky webmasters, and it's a waste of time to boot. The biggest sin at a start-up is wasting time, and this easily qualifies.

The only time you want a sitemap is when you don't have a text front-end, and even then, you are way better off mimicking your JavaScript with a text layer for the search engines than implementing some sitemap.

Even submitting a page in a sitemap doesn't ensure it gets crawled, and it certainly won't get crawled if you have a large site and don't set up your sitemap just right.


> I call bullshit.

That's cool with me.

> Have you actually ever had someone clone your site or tried to protect against it this way?

Yes, otherwise I wouldn't have written that, now wouldn't I?

> I am all ears for your story... I have never heard of anything like that.

That you haven't heard of something is not a reason to assume that it doesn't exist. There are probably more things that you haven't heard about that aren't bullshit either.

In fact, there are probably lots of those things.

I recently did a fairly large project, well documented here on HN and in the press, where I did just that. Some less than honorable characters were cloning the data as fast as it hit the server and turning it into 'mfa' fodder, so at some point I shut that down and submitted the site to google, which then crawled it at its leisure.

Even today that crawl isn't 100% complete yet, and the only way you can reach those pages right now is by going through the google index.

Long term I expect plenty of those pages to be linked again, both internally by linking pages that are related as well as externally by people that link to their content.

I get about 50 emails daily confirming that the strategy worked: those people have lost their content and find it again through google, and this is the only copy of it on the web, in spite of active attempts at making clones.

The ip blocklist for that site is close to 10,000 IPs.

> Even submitting a page in a sitemap doesn't ensure it gets crawled, and it certainly won't get crawled if you have a large site and don't set up your sitemap just right.

No, if you don't set it up right then of course, it won't work, but that goes for most things in the technology world.

Right now there are 197,000 pages indexed on that site according to google so I really can't complain, it seems to have worked very well.


Ok, well maybe bullshit is a strong term, so I apologize.

Still, it seems really odd to be using a sitemap like that. That's certainly not the intended purpose, and if that were a reasonable defense, what's to stop the spammers from employing it as an offense?

As far as I can tell, you use a sitemap if your site doesn't map in the normal way. Even then, it's not nearly as effective as a text site with any in-links - so much so that I wouldn't hinge any SEO strategy on a sitemap, and I wouldn't recommend setting one up to any new start-up founder. Even if your observations on this are correct, then you are a very special case.

It's also worth pointing out that it seems your site map was no defense until you took other measures, which wasn't clear from your original post:

"so at some point I shut that down and submitted the site to google"

Responding to some of your points:

"I get about 50 emails daily confirming that the strategy worked..."

* This confirms that your site is indexed, but it doesn't confirm your strategy beat the spammers. It may be more of Google's algorithms deciding you are best, no? I can't say for sure obviously, but you might have fared better if you hadn't done this at all.

"Even today that crawl isn't 100% complete yet"

* Part of the reason for this is because your pages have a low-priority to be crawled, because they are submitted via a site map.


> Ok, well maybe bullshit is a strong term, so I apologize.

no problem.

> Still, it seems really odd to be using a sitemap like that. That's certainly not the intended purpose,

Agreed, but that's what we're hackers for, right?

> and if that were a reasonable defense, what's to stop the spammers from employing it as an offense?

That they didn't have the URL, but the google bot did.

> As far as I can tell, you use a sitemap if your site doesn't map in the normal way.

Or if your site has crappy navigation, or if you want to comply with some countries' accessibility rules.

> Even then, it's not nearly as effective as a text site with any in-links - so much so that I wouldn't hinge any SEO strategy on a sitemap, and I wouldn't recommend setting one up to any new start-up founder.

Agreed, but we're not all just start-up founders, and even if we are we're not all above using the occasional trick to get an edge.

> Even if your observations on this are correct, then you are a very special case.

That's possible.

> It's also worth pointing out that it seems your site map was no defense until you took other measures, which wasn't clear from your original post:

> "so at some point I shut that down and submitted the site to google"

The thing I shut down was the publicly accessible version of the map, so only google had access to the real thing.

> Responding to some of your points:

> "I get about 50 emails daily confirming that the strategy worked..."

> * This confirms that your site is indexed, but it doesn't confirm your strategy beat the spammers. It may be more of Google's algorithms deciding you are best, no? I can't say for sure obviously, but you might have fared better if you hadn't done this at all.

Google keeps something called a 'quad' list if I'm not mistaken, which contains a series of word ids in sets of four for every page that exists in their index. If certain sets of 'quads' are unique to your page then those are used to judge your page as original for that bit of content. I gather that quite a few of the 'random keywords spam pages' are built on that predicate. I'm not sure if that is outdated or even plainly wrong but it would explain the pattern in clone sites sometimes ranking higher than the originals simply because they got crawled earlier.
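
To make the 'quad' idea concrete, here is a small Python sketch of four-word sequences and the set difference between two pages. This only illustrates the concept as described above; it is not a claim about how Google actually stores or compares pages, and the two sample texts are invented.

```python
import re


def quads(text):
    """Return the set of all runs of four consecutive words in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + 4]) for i in range(len(words) - 3)}


# Invented sample texts: the clone reuses part of the original.
original = "cast upstream and let the fly drift past the rock"
clone = "buy cheap rods cast upstream and let the fly drift today"

shared = quads(original) & quads(clone)          # sequences both pages contain
only_original = quads(original) - quads(clone)   # sequences unique to the original
```

Under this picture, whichever page gets indexed first would "own" the shared sequences, which would be consistent with the crawl-order effect described above.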

> "Even today that crawl isn't 100% complete yet"

> * Part of the reason for this is because your pages have a low-priority to be crawled, because they are submitted via a site map.

That's quite possible, but I'm not in a hurry. The project was huge and if it takes a year to get it indexed I'm perfectly content with that.

The 197,000 pages result in about 7000 unique visitors daily, the total number of pages is in the millions.


Ok, well, you win again.


The advantage of a sitemap, linked to from, say, a footer link on every page of your site, is that it's easier to find than a page 4 levels down. This actually makes all pages on your site "findable" at 2 levels down (homepage > sitemap > target page). I think it should improve indexing on otherwise hard-to-find pages.
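
The depth argument is easy to check with a breadth-first search over a site's internal links. Below is a minimal Python sketch; the five-page site is a made-up example, and `click_depths` is just an illustrative helper name.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: minimum number of clicks from start to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site where the target page is buried 4 levels down.
site = {
    "home": ["category"],
    "category": ["subcategory"],
    "subcategory": ["archive"],
    "archive": ["target"],
    "target": [],
}
before = click_depths(site, "home")["target"]

# Add a footer link to a sitemap page on every page; the sitemap links to all pages.
with_sitemap = {page: targets + ["sitemap"] for page, targets in site.items()}
with_sitemap["sitemap"] = list(site)
after = click_depths(with_sitemap, "home")["target"]
```

Here `before` is 4 and `after` is 2, matching the homepage > sitemap > target page path.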


That's been my experience, as well.


Great summary.

I'd add that SEO is not just a ranking exercise-- it's a conversion exercise. On top of all of the factors that you need to consider for ranking, you should consider that the first ~60 characters of your <title> tag and the first ~150 of your meta-description are what drive people to click through to your site.
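
A quick way to preview this is to cut your title and description at those lengths and read what is left. Here is a rough Python sketch; the ~60 and ~150 limits are the approximate figures above, while cutting at a word boundary and appending "..." is my own assumption about how a truncated snippet looks. The sample title is invented.

```python
TITLE_LIMIT = 60          # approximate figure from the comment above
DESCRIPTION_LIMIT = 150   # approximate figure from the comment above


def visible_part(text, limit):
    """Return the text that fits in the limit, cut at a word boundary (assumed)."""
    if len(text) <= limit:
        return text
    cut = text[:limit].rsplit(" ", 1)[0]
    return cut + "..."


# Invented example title that is too long to be shown in full.
title = ("Bob's Fishing Tips: Fly Fishing, Bait, Tackle and Gear Advice "
         "for Complete Beginners")
shown = visible_part(title, TITLE_LIMIT)
```

If the keywords and the reason to click do not survive in `shown`, it is worth rewording so they come first.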

Also, all other things being equal, the first position on SERPs gets ~42% of clicks... So it's a winner take all game unless you are talking about a content business.

OH-- also do some keyword analysis. In Andrew's example-- how many people search for "fishing tips" versus "fishing advice" or "how to fish" or "intro to fishing"? Keyword analysis is about finding out what people are searching for and then digging in to see how competitive those keywords are. If you want to dethrone the #1 result for "fishing tips" you'll want to analyze their SEO. If they are a 10 year old page-rank-9 domain with 87,500 inbound links, you might consider gunning for a less sought-after search term.


Excellent point. I was in charge of much of the SEO of a large internet retailer and our organic traffic was stunning. Unfortunately, no matter how hard I lobbied to start work on conversion, split testing, etc., they insisted I do nothing but drive traffic to our poorly converting sites. When sales started flattening out despite the continuing month-over-month increase in organic traffic, I was let go as a cost-saving measure. And while they are still an Internet Retailer 500 company, their position has slipped significantly down the ranks. At the time I left we had top ten positions for hundreds of our most important keywords, including the #1 position for many of our first-tier words. When doing SEO, remember that traffic is only as good as it converts.


This all seems like good advice too. I think the top SERP position gets more than 42% of clicks though... more like 70%. SEO is in many ways a winner-take-all sport at this point, and it's gotten more so over the years. Google released some eye-tracking data and people just look at the top 1 and then top 3 results, then they re-search.


Don't forget titles for links and alts for images! Also, if your site may potentially produce duplicate content, due to parameters in URLs etc, look into canonical urls:

http://www.seomoz.org/knowledge/duplicate-content
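
To show what the canonical URL fix looks like in practice, here is a minimal Python sketch that collapses URL variants to one form and emits the matching `<link rel="canonical">` tag. The list of ignored parameters is hypothetical and would depend on your own site; the example URLs are invented.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the page content on this site.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_url(url):
    """Drop ignored parameters, sort the rest, and lowercase the host."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query)
                  if k not in IGNORED_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc.lower(),
                       parts.path or "/", urlencode(kept), ""))


def canonical_tag(url):
    """Return the <link> element to place in the page's <head>."""
    return '<link rel="canonical" href="%s">' % html.escape(canonical_url(url), quote=True)


# Two invented variants of the same page.
a = canonical_url("http://www.bobsfishingtips.com/tips?sessionid=abc&page=2")
b = canonical_url("http://WWW.BobsFishingTips.com/tips?page=2&utm_source=feed")
tag = canonical_tag("http://www.bobsfishingtips.com/tips?sessionid=abc&page=2")
```

Both variants reduce to the same URL, so every duplicate page can point the search engine at a single address.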



