Thursday, October 9, 2008

Google debunks the myth of the duplicate content penalty

I read this somewhere and I cannot remember where....

OK, this is kind of a "so what, big deal" story, because there never really was a duplicate content penalty. It has always been more of a duplicate content problem. Regardless, Google has now officially confirmed that the duplicate content penalty doesn't exist. You're not actually penalized by Google for having duplicate content on your site.

Rather, the issue is that duplicate content causes problems for you in lots of other ways. For instance, if you have different URLs pointing at the same content, Google will only show one of those URLs in the search results. The other URLs aren't banned or penalized; they just don't get shown. Google does this to avoid displaying redundant listings in the search results.

The problem is that Google might not choose the URL you want them to use. Also, the more time that Google's spider spends crawling and filtering out redundant URLs, the less time it's likely to spend indexing your important pages.

Anyway, here's how to entice Google to always index the "correct" URL without wasting time dealing with duplicate content:

  • Solve your site's canonical problems with a 301 redirect.
  • Be consistent in your internal linking. Link to the version of the URL you want indexed, and be consistent in how you format your links.
  • When creating a Google Sitemap, make sure the URLs in your sitemap are the ones you want indexed.
  • Block duplicate versions of your URLs using your robots.txt file.
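
The steps in this list can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular web server or Google itself works: the preferred host `www.example.com`, the default document names, the `/print/` directory of duplicate pages, and the function names are all hypothetical and chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# Assumptions for this sketch only: the preferred scheme and host, the
# default document names, and a hypothetical /print/ directory that
# holds duplicate (printer-friendly) copies of pages.
PREFERRED_SCHEME = "http"
PREFERRED_HOST = "www.example.com"
DEFAULT_DOCUMENTS = ("index.html", "index.php")
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def canonical_url(url):
    """Return the single URL form we want indexed for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Treat /dir/index.html and /dir/ as the same page.
    for doc in DEFAULT_DOCUMENTS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]
            break
    # Force the preferred scheme and host, keep the query, drop the fragment.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if the URL is a duplicate form, else None."""
    target = canonical_url(url)
    if target == url:
        return None
    return (301, target)


def sitemap_urls(urls):
    """Keep only canonical URLs, de-duplicated, in first-seen order."""
    seen = []
    for url in urls:
        target = canonical_url(url)
        if target not in seen:
            seen.append(target)
    return seen


def blocked_by_robots(url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text disallows crawling the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("*", url)
```

Under these assumptions, `redirect_for("http://example.com/index.html")` returns `(301, "http://www.example.com/")`, while the canonical URL itself returns `None` and is served normally. Using the same `canonical_url` function for internal links and for the sitemap is one way to keep every place that mentions a URL in agreement about which version should be indexed.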

Remember, duplicate content won't cause your site to be banned or penalized—but it certainly can cause your site to perform badly in the rankings. Obviously, it's a problem you'll want to avoid. For more information, see our in-depth report: The SEO's Guide to Preventing Duplicate Content.

Of course, if you're operating on the dark side and stealing content from other sites on a large scale in order to spam the search engines (such as with scraper sites), then all bets are off. In such cases you can certainly expect a penalty and possibly a ban. But short of content scraping (and scrapers tend to be fully aware of the risks involved), dup-content penalties aren't your worry.

A more realistic concern is duplicate content produced when you share content with other websites. When multiple sites display the same content, Google will list only one of these pages in the search results and filter out the rest.

Some of the criteria that Google uses to decide which page deserves to be displayed in the search results include:

  • The site where Google first indexed the page.
  • The authority of the site hosting the page.
  • Whether the various copies link back to an original source.

If you let other sites reprint content, make sure they link back to you. That should ensure you'll be credited as the original source. We also suggest you embed links back to your site within your articles. By doing so you're making it clear to Google which site is the originator, even if people steal your content without permission. And, if content theft is a problem, be sure to read our report: What To Do When Someone Steals Your Content.
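
If you want to check that a reprinted copy really does link back to you, a small script can do it. The following is a rough sketch using only Python's standard library; the class and function names and the `www.example.com` host are assumptions made for the example, and it only looks at plain `<a href>` links in HTML you have already downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back_to(html, source_host):
    """Return True if any link in the HTML points at source_host.

    source_host is expected in lowercase, e.g. "www.example.com".
    Relative links have no host and therefore never match.
    """
    collector = LinkCollector()
    collector.feed(html)
    return any(urlsplit(href).hostname == source_host for href in collector.hrefs)
```

For example, `links_back_to(page_html, "www.example.com")` is `True` when the reprint contains a link such as `<a href="http://www.example.com/article">`, and `False` when it contains only relative links or links to other hosts.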

Also take note that the filtering of duplicate pages is a big reason why traditional article marketing doesn't work very well anymore. If you syndicate a single article to dozens of article directories, you'll only get credit for a single link because all of the other copies of your article are filtered out of the search results. Be sure to read our report on article marketing done right, where we show you how to get around this issue and get credit for your article marketing links.

Finally, keep in mind that if you've had duplicate content problems on your site, you don't need to file a reconsideration request because you haven't actually been penalized. All you'll need to do is solve the problems by taking appropriate actions.

For more information on how Google handles duplicate content, see the Google help files.





