Reunifying duplicate content on your website

Tuesday, October 06, 2009

Handling duplicate content within your own website can be a big challenge. Websites grow; features get added, changed and removed; content comes—content goes. Over time, many websites collect systematic cruft in the form of multiple URLs that return the same contents. Having duplicate content on your website is generally not problematic, though it can make it harder for search engines to crawl and index the content. Also, PageRank and similar information found via incoming links can get diffused across pages we aren't currently recognizing as duplicates, potentially making your preferred version of the page rank lower in Google.

Steps for dealing with duplicate content within your website

  1. Recognize duplicate content on your website. The first and most important step is to recognize duplicate content on your website. A simple way to do this is to take a unique text snippet from a page and to search for it, limiting the results to pages from your own website by using a site: query in Google. Multiple results for the same content show duplication you can investigate.
  2. Determine your preferred URLs. Before fixing duplicate content issues, you'll have to determine your preferred URL structure. Which URL would you prefer to use for that piece of content?
  3. Be consistent within your website. Once you've chosen your preferred URLs, make sure to use them in all possible locations within your website (including in your Sitemap file).
  4. Apply 301 permanent redirects where necessary and possible. If you can, redirect duplicate URLs to your preferred URLs using a 301 response code. This helps users and search engines find your preferred URLs should they visit the duplicate URLs. If your site is available on several domain names, pick one and use the 301 redirect appropriately from the others, making sure to forward to the right specific page, not just the root of the domain. If you support both www and non-www host names, pick one, use the preferred domain setting in Webmaster Tools, and redirect appropriately.
  5. Implement the rel="canonical" link element on your pages where you can. Where 301 redirects are not possible, the rel="canonical" link element can give us a better understanding of your site and of your preferred URLs. The use of this link element is also supported by major search engines such as, Bing and Yahoo!.
  6. Use the URL parameter handling tool in Google Webmaster Tools where possible. If some or all of your website's duplicate content comes from URLs with query parameters, this tool can help you to notify us of important and irrelevant parameters within your URLs. More information about this tool can be found in our