Hacker News

> It would be trivial to do some rudimentary parsing on the url string to determine where you really wanted to go

Specific to this point, a new project I'm building supports "pretty" URLs and I've found my (now) favourite solution is to build an aliases system.

It works like so: when a user creates an item, an "alias" is registered, set to "current", and all future queries to that alias are logged. If the user later causes the URL to change (a name change, etc.), the new alias is registered, but the old one is retained and 301s to the new alias. All aliases are visible to the user, and they can invalidate them manually (if they want to re-use an alias, for example). However, if a retired alias has received a large number of hits from a single source (say, 50 referrals from website.com to mysite.com/previous-alias), the system assumes the user posted the link on another website; invalidating that alias would create a dead link (and lose my site traffic), so it isn't allowed.
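The flow above can be sketched roughly like this (class and method names are hypothetical; a real implementation would back this with a database table):

```python
from dataclasses import dataclass, field

@dataclass
class Alias:
    slug: str
    current: bool = True
    hits_by_referrer: dict = field(default_factory=dict)

class AliasRegistry:
    def __init__(self, retirement_hit_threshold=50):
        self.aliases = {}  # slug -> Alias
        self.threshold = retirement_hit_threshold

    def register(self, slug):
        # Retire any existing "current" alias; it keeps 301-redirecting here.
        for a in self.aliases.values():
            a.current = False
        self.aliases[slug] = Alias(slug)

    def resolve(self, slug, referrer=None):
        a = self.aliases.get(slug)
        if a is None:
            return 404, None
        if referrer:  # log every query to a retired or current alias
            a.hits_by_referrer[referrer] = a.hits_by_referrer.get(referrer, 0) + 1
        if a.current:
            return 200, slug
        current = next(s for s, x in self.aliases.items() if x.current)
        return 301, current  # old alias permanently redirects to the new one

    def can_invalidate(self, slug):
        # Refuse if a single source has sent many hits to a retired alias:
        # the link is probably posted elsewhere and would go dead.
        a = self.aliases[slug]
        return a.current or max(a.hits_by_referrer.values(), default=0) < self.threshold
```

The key design point is that resolution and hit-logging happen in the same place, so the "is this alias safe to invalidate?" question can be answered from data the redirect path collects anyway.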

I guess it's convoluted and adds extra overhead, but I feel like if you have pretty URLs (which are, in my opinion, something a website should aim for), you need to be in a position where changing them won't break the rest of the internet. The easy solutions are pseudo-pretty URLs (e.g. website.com/123-pretty-url, where 123 is the ID and pretty-url is just an ignored string) or never allowing URLs to change, but I don't like either.

I wonder if any other websites have a good approach to this.



At Stack Overflow we use slugs for this approach:

http://stackoverflow.com/questions/427102

http://stackoverflow.com/questions/427102/what-is-a-slug

etc., will all redirect to the canonical:

http://stackoverflow.com/questions/427102/what-is-a-slug-in-...

If the title changes, we update the slug and 301-redirect to the new canonical URL.
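A minimal sketch of that routing rule, assuming hypothetical names and data: the numeric ID is authoritative, and any missing or stale slug gets a 301 to the canonical URL.

```python
# Hypothetical store: question id -> current slug
QUESTIONS = {123: "example-slug"}

def route(question_id, slug=None):
    """Resolve /questions/<id>[/<slug>] the way the comment describes."""
    canonical_slug = QUESTIONS.get(question_id)
    if canonical_slug is None:
        return 404, None
    canonical = f"/questions/{question_id}/{canonical_slug}"
    if slug != canonical_slug:
        # Covers both the slug-less URL and any out-of-date slug.
        return 301, canonical
    return 200, canonical
```

When the title changes, only the stored slug is updated; every previously shared URL keeps working because the ID still resolves and the stale slug falls into the 301 branch.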


I wouldn't recommend having the "pretty" part not validated. It can cause serious issues with Google and duplicate content, and someone malicious can create a bunch of fake URLs that essentially point to the same page; even worse, if those URLs receive enough links, the page can get indexed at the fake URL. A similar thing happened to a newspaper's website, but I can't recall which off the top of my head.

Another potential solution, and my preferred method, is this: whenever a change is made that would affect the URL of a page, update a "legacy" table with the old URL and the location of the new one. Then, whenever a 404 is about to be thrown, search that table and redirect accordingly if a new URL is found. I rolled this approach into https://github.com/leonsmith/django-legacy-url, and while it's not polished, it's by far the easiest and probably the most automatic/maintainable solution I have found.
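The legacy-table fallback can be sketched like so (hypothetical names and in-memory dicts standing in for database tables):

```python
LEGACY = {}  # old path -> new path
PAGES = {"/new-path": "page content"}  # hypothetical live pages

def record_move(old_path, new_path):
    """Called whenever a change would alter a page's URL."""
    LEGACY[old_path] = new_path
    # Re-point older entries that chained to the old path,
    # so a single 301 reaches the final destination.
    for k, v in LEGACY.items():
        if v == old_path:
            LEGACY[k] = new_path

def handle(path):
    """Normal lookup first; consult the legacy table before 404ing."""
    if path in PAGES:
        return 200, path
    if path in LEGACY:
        return 301, LEGACY[path]
    return 404, None
```

Because the table is only consulted on the 404 path, the happy path pays no cost, and collapsing redirect chains keeps old links to a single hop.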


"I wouldn't recommend having the "pretty" part not validated. It can cause some serious issues with google & duplicate content"

Not if you properly generate and apply canonical links :)
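For instance, whatever slug was actually requested, the rendered page can always emit a canonical link element pointing at the one true URL. A minimal sketch, with hypothetical names and the 123-pretty-url format from above:

```python
def canonical_tag(item_id, true_slug, base="https://example.com"):
    """Emit the rel=canonical tag for a page, ignoring the requested slug."""
    return f'<link rel="canonical" href="{base}/{item_id}-{true_slug}">'
```

The tag is derived purely from the stored record, so fake slugs in the request can never leak into it.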


Canonical links are only hints to Google (all be it very strong ones); they always reserve the right to ignore them if they think a webmaster is shooting themselves in the foot, and that in itself is where the problem is. If I built up a few hundred links to example.com/1234-this-site-sucks, I'm sure Google would treat that as the correct URL rather than the canonical example.com/1234-the-real-slug.


FYI: the word you wanted is "albeit".


Your approach is exactly how Drupal does it, and it's one of the things I really felt that Drupal got right (and have mimicked in websites I have built since).



