Double Your Trouble: Google Highlights Duplication Issues
Maile Ohye posted a great piece on Google Webmaster Central on the effects of duplicate content caused by common URL parameters. There is great information in that post, not least of which is that it validates exactly what a few of us have been saying for a while: duplication should be addressed because it can water down your PageRank.
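To make the problem concrete, here is a minimal sketch in Python (the parameter names like "sessionid" are hypothetical) of how content-neutral URL parameters spawn duplicate URLs for the same page, and how an application might normalize them to a single canonical form:

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Hypothetical parameters that don't change the page's content
CONTENT_NEUTRAL_PARAMS = {"sessionid", "ref", "sort"}

def canonicalize(url):
    """Strip content-neutral parameters and sort the rest,
    so equivalent URLs collapse to one canonical form."""
    parts = urlparse(url)
    params = [(k, v) for k, v in parse_qsl(parts.query)
              if k.lower() not in CONTENT_NEUTRAL_PARAMS]
    query = urlencode(sorted(params))
    return urlunparse(parts._replace(query=query))

# All three of these collapse to http://example.com/item?id=7
print(canonicalize("http://example.com/item?id=7&sessionid=abc123"))
print(canonicalize("http://example.com/item?sessionid=xyz&id=7"))
print(canonicalize("http://example.com/item?id=7"))
```

Sorting the surviving parameters matters too, since engines treat ?a=1&b=2 and ?b=2&a=1 as two different URLs.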
Maile suggests a few ways of addressing dupe content, and she also reveals a few interesting details about how Google works, including:
- Unnecessarily long URLs are unattractive and might reduce the chances that a user will click through to your page. This sounds like a subjective, somewhat counter-intuitive opinion (one could assume that users focus more on link titles than on URL length), but it's quite possible that Google has run enough experiments during usability testing to know that longer URLs actually depress click-through rates. So, avoid long URLs in your application design where possible!
- When deciding what to display from your site for a user's search, if Google detects duplicate content matching the user's query, they'll group all the dupe pages into a cluster, then choose which page in the cluster is the best one to present to the searcher.
- They attempt to focus the collective "link popularity" or PageRank from all members of a cluster on your site onto one page. This is slightly odd, since it runs counter to her earlier statement that duplication can dilute "link popularity." Most likely, there are cases where Google finds it difficult to cluster all the dupes from a site, so it's still best to reduce duplication yourself rather than rely solely upon their algorithms to handle it for you.
- She suggests using a Sitemap to inform Google of a site's primary URLs, which implies that Google may use Sitemaps as a prime signal when selecting the canonical URL for a particular cluster (a short generation sketch follows this list). Note that while a Sitemap could help Google select which of your dupe pages to present to a user, it doesn't solve your entire dupe problem; you should still use additional methods to manage dupes.
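Since she points to Sitemaps as a canonical-URL hint, here is a minimal sketch of generating a Sitemap that lists only the canonical URL for each page. The URLs are hypothetical; the urlset namespace is the standard one from sitemaps.org:

```python
from xml.sax.saxutils import escape

# Hypothetical canonical URLs, one per page on the site
canonical_urls = [
    "http://www.example.com/",
    "http://www.example.com/item?id=7",
]

lines = ['<?xml version="1.0" encoding="UTF-8"?>',
         '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
for url in canonical_urls:
    # escape() handles characters like "&" in query strings
    lines.append("  <url><loc>%s</loc></url>" % escape(url))
lines.append("</urlset>")

with open("sitemap.xml", "w") as f:
    f.write("\n".join(lines) + "\n")
```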
Maile’s suggestions reiterate some of the de-duplication advice I’ve previously given, and I’ve also suggested having your site resolve to a single domain name to reduce duplication (along with Google’s Matt Cutts, who also recommends domain canonicalization).
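For the single-domain point, here is a minimal sketch of host canonicalization, assuming a Python WSGI application and a hypothetical canonical hostname; on Apache the same effect is typically achieved with a mod_rewrite 301 rule:

```python
CANONICAL_HOST = "www.example.com"  # hypothetical canonical domain

def canonicalize_host(app):
    """WSGI middleware: 301-redirect requests for any alternate
    hostname (e.g. bare example.com) to the one canonical domain."""
    def wrapper(environ, start_response):
        host = environ.get("HTTP_HOST", "")
        if host and host != CANONICAL_HOST:
            location = "http://" + CANONICAL_HOST + environ.get("PATH_INFO", "/")
            if environ.get("QUERY_STRING"):
                location += "?" + environ["QUERY_STRING"]
            # A permanent redirect signals that the old URL is superseded
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        return app(environ, start_response)
    return wrapper
```

The 301 status is the important part: a permanent redirect tells the engines to consolidate link popularity onto the canonical host.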
Each of the search engines handles duplication a little differently, which makes it desirable to follow best practices in managing the issue if you really want to improve your site's natural search performance.
Posted by Chris of Silvery on 09/12/2007
Filed under: Best Practices, Dynamic Sites, Google, PageRank, Search Engine Optimization, SEO, Site Structure, URLs, Canonicalization, duplicate-content, duplication