Duplicate Content

Definition

What is duplicate content?

 

Simply put, duplicate content is any piece of web content (or section thereof) that appears in more than one place online, either in exactly the same form or with only insignificant alterations.
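To make "insignificant alterations" more concrete, here is a minimal sketch of one common way to score how similar two pieces of text are: Jaccard similarity over overlapping word sequences ("shingles"). This is an illustration only and not a description of how Google or any other search engine detects duplicates. The function names and the shingle size of 3 are assumptions chosen for the example.

```python
from __future__ import annotations


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Score two texts from 0.0 (no shared shingles) to 1.0 (identical shingle sets)."""
    shingles_a, shingles_b = shingles(a, size), shingles(b, size)
    if not shingles_a and not shingles_b:
        return 1.0  # two empty texts are trivially identical
    return len(shingles_a & shingles_b) / len(shingles_a | shingles_b)
```

Two pages that score close to 1.0 are near-duplicates by this measure. Where to draw the line is a judgment call that depends on your content.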

 

If you wish to gain a more in-depth understanding of this topic, check out the FAQ section below:

 

Question #1: Is duplicate content bad?

 

Not always. According to Google itself, duplicate content is a problem only when it is meant to deceive search engines and manipulate search rankings.

 

Examples of duplicate content that are not inherently harmful include:

 

  • Websites that live on two or more domains
  • Websites that have a full desktop version and a stripped-down mobile version
  • Websites that have a full version and a printer-only version

 

In all three examples, the content appears in more than one place online, either in exactly the same form or with minimal alterations, but there is clearly no intention to deceive search engines.

 

This, however, does not automatically mean that you are in the clear. Unfortunately, search engines cannot always tell innocent duplicate content from duplicate content created with malicious intent, so you need to take some extra steps to make sure they know you have no ill intentions. We will talk about those steps in more detail later.

 

Question #2: How does duplicate content affect SEO?

 

While non-malicious duplicate content is not penalised, duplicate content that is used to manipulate search rankings gets penalised, because Google’s main goal is to provide users with accurate, relevant, and distinct search results at all times.

 

So, whenever Google finds websites that contain content that is duplicated for the purpose of tricking search engines, it either pushes them down its search engine results pages (SERPs) or completely delists them.

 

If your content gets flagged and your site gets penalised, all you have to do is fix the issues and submit your site to Google for reconsideration—which brings us to the next point:

 

Question #3: How do I fix duplicate content?

 

The easiest way to fix duplicate content is to avoid creating it in the first place. For duplicate content you cannot avoid, such as the cases listed in the “Is duplicate content bad?” section, there are several fixes you can implement, including:

 

  • Canonicalization
  • Using the ‘noindex’ tag
  • Expanding or merging duplicate content

 

Let us take a closer look at each one:

 

First, canonicalization lets you tell Google which version of your duplicate content you want to show up on SERPs. With this approach, you are still allowing Google’s spiders to crawl all copies of the duplicate content but preventing more than one copy from coming up as a search result.
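In practice, canonicalization is commonly done by placing a `<link rel="canonical">` element in the `<head>` of every copy, pointing at the preferred URL. The sketch below builds that element and reads it back from a page so you can check that each duplicate points to the same place. The helper names and the example.com URLs are hypothetical. Only the `rel="canonical"` markup itself is standard.

```python
from html import escape
from html.parser import HTMLParser
from typing import Optional


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link> element that names the preferred version of a page."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_canonical = (attributes.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.canonical_url is None:
            self.canonical_url = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical_url


# Hypothetical example: the printer-only copy points back at the main page.
printer_page = f"<html><head>{canonical_link_tag('https://example.com/guide')}</head></html>"
print(find_canonical(printer_page))
```

Running `find_canonical` over every variant of a page is a quick way to spot copies that are missing the element or that point at different URLs.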

 

Second, using the ‘noindex’ tag lets you tell Google which pages on your website it should not index. Google can still crawl a page that carries the tag (it has to, in order to read the tag), but it will leave that page out of its search results. This makes it a good way to keep duplicate content on your website from being indexed and shown on SERPs. This approach can suit something like a set of product pages that contain different variants of the same product.
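The ‘noindex’ directive is usually delivered in one of two ways: a robots `<meta>` element in the page's `<head>`, or an `X-Robots-Tag` HTTP response header. The sketch below shows both, with a hypothetical helper, `add_noindex`, that inserts the meta element into an existing page. Note that search engines can only see either signal if they are allowed to fetch the page.

```python
import re

# The two standard ways of expressing noindex.
NOINDEX_META_TAG = '<meta name="robots" content="noindex">'
NOINDEX_HEADER = ("X-Robots-Tag", "noindex")  # for non-HTML files or server-level config

# Matches an opening <head> tag (with or without attributes) but not <header>.
_HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def add_noindex(html: str) -> str:
    """Insert the noindex meta element right after the opening <head> tag."""
    if NOINDEX_META_TAG in html:
        return html  # already present, nothing to do
    match = _HEAD_OPEN.search(html)
    if match is None:
        raise ValueError("no <head> element found")
    return html[:match.end()] + NOINDEX_META_TAG + html[match.end():]


# Hypothetical product-variant page that should stay out of search results.
variant_page = "<html><head><title>Shoe, red</title></head><body></body></html>"
print(add_noindex(variant_page))
```

The header form is handy for files that have no `<head>`, such as PDFs, while the meta element is the simpler choice for ordinary HTML pages.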

 

Finally, you can merge pages that contain identical or near-identical content, or add enough unique content to each version that they are no longer the same.

 

If you want more information on how you can fix duplicate content, check out Google’s guidelines on the subject.

 

Question #4: What do I do if someone scrapes my content?

 

In most cases, you do not have to do anything if someone scrapes your content and duplicates it on their website. Google is generally good at identifying scraped content and penalising the websites that publish it.

 

The only time you need to take action is when a website that scraped your content outranks you on Google’s SERPs. This means they have somehow successfully tricked Google into thinking that theirs is the best version to display on search.

 

When this happens, the first thing you need to do is contact the owner of the website in question and ask them to remove your content from their site. If they refuse, you can report them to Google directly.