Duplicate Content – Do you have yours in check?

Duplicate content is one of the fastest killers of organic ranking that I’ve ever seen. All your beautiful inbound marketing and SEO efforts…destroyed. You’ll be surprised what actually counts as duplicate content on your site, so be sure to check yours regularly and adjust. Here’s the why, what and how:

Why it matters

Take your mind back to approx. 2006, a time when pages were stuffed with keywords on unlinked pages or within their code. This was done to send a stronger signal to search engines about what a site was all about. Soon after, this practice was outlawed and called out for what it was: cheating. It made sense that, as a next step, duplicate content would also be penalised. To be honest, I agree with the rule. If you simply create copies of your pages, you are not offering anything valuable to searchers and your potential audience.

Although the duplicate content rule has been around for a number of years now, you’ll be surprised how many sites still have duplicates or very common content. Even if you think you don't, we'd recommend you check.

I have seen sites go from strong organic visits to near zero within minutes of being identified as having duplicate content. Building it back up takes weeks, if not months, of cleaning your site, creating new content and promoting that content heavily (often with a PPC booster).

Duplicate content matters because it’s not a good experience for your audience, because you will lose your organic ranking and because of the workload associated with recovering after you’ve been caught out.

What is considered duplicate content?

Imagine this: you are a marketing school trying to attract audiences from different countries to your new marketing course. To help potential students with country-specific information (e.g. what visas they need, language queries, what they can expect culturally etc), you create a version of the course page for each country. Each page has a separate URL, page title etc. That’s valuable to the searcher, right? Yes. That’s more indexed pages, right? Yes. That’s good for rankings, right? Wrong. If the pages you’ve created carry mainly the same content, they are highly likely to be seen as duplicates. In this example a large section of each page would have the same course copy. That’s already duplicate content.

Ecommerce sites can easily struggle with this. If you have separate pages for products that are largely similar (e.g. t-shirts sold in blue, green, red and white), then again these pages are seen as duplicate even though they are valuable to the user.

Avoid content that is a one-to-one copy (this can happen easily if you amalgamate two websites or blogs) and copy that is largely the same or very common across your pages. Both are considered duplicate content.

How to identify and handle duplicate content

In my favourite free SEO tool blog I mentioned Siteliner.com. This is a great place to start checking your pages for duplicate content. It looks not just at one-to-one duplication but also at common content.

Nothing beats a good SEO review to identify duplicate content. As you go through each of your pages checking for on-page SEO effectiveness, you read the copy and will soon notice if something sounds like you’ve heard it before. Keep track of all your pages and check manually for duplication.
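If you keep the copy of your pages in a spreadsheet or export, a small script can give the manual check a head start. Below is a minimal sketch, assuming you already have the plain text of each page keyed by URL. It compares pages by the three-word phrases they share (a Jaccard similarity over word "shingles"). The function names, the 0.5 threshold and the phrase length are my own illustrative choices; this is not how Siteliner or Google measure duplication, just a rough way to shortlist pages worth reading side by side.

```python
import re
from itertools import combinations


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams ("shingles") found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 (no overlap) to 1.0 (identical)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_similar_pages(pages: dict, threshold: float = 0.5) -> list:
    """Return (url_a, url_b, score) for every pair of pages scoring at or above threshold.

    The threshold is an assumption for illustration; tune it on your own site.
    """
    flagged = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            flagged.append((url_a, url_b, round(score, 2)))
    return flagged
```

Feed it a dictionary such as `{"/course/ireland": "...", "/course/germany": "..."}` and read through whatever pairs come back. A high score is a prompt to look, not a verdict.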

When you identify duplicate content you have several options to fix it:

1. Rewrite content

Here is the easiest way to fix duplicate content: rewrite the duplicate. It’s natural to have some common content (you use your keywords a number of times, talk about the same industry on your blog etc). Work with common sense and tools to identify which content is just slightly common and which you should amend. Then get to rewriting. Make sure you update all your on-page SEO elements as well, like the page title, subheaders, meta description etc.

2. Canonical tag

The canonical tag is really handy. I used this quite a bit when I was working for an NGO a number of years ago. As the small Irish branch, we had few staff to write original content, but our supporters were active on our site and I wanted to keep engagement there and not send them elsewhere. With offices all around the world, I took blog content from the other offices’ sites and reused it. That’s fine with the addition of the canonical tag, which tells Google: “Hey, I’m not cheating. I’m sharing content and I’m telling you where I took it from and where to put the SEO juice.”

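It’s a simple fix that anyone who can access the HTML of your pages can do (so yeah, anyone). The tag is a link element with rel="canonical" whose href is the URL of the original page. With a placeholder URL it looks like this:

<link rel="canonical" href="https://example.com/original-post/" />

It needs to be placed in the <head> section of your page.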

The downside of the canonical tag, as you will have noticed in the example above, is that the SEO juice gets passed to the original source, so your site will not benefit. If you want the juice, you have to write original content.

The tag is also often used to point variations of the same page (e.g. http vs https, www vs non-www) to one preferred URL. If you want to read more about this and canonical tags in general, check out this post from Moz.
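Once you rely on canonical tags, it is worth checking that each page actually declares the one you expect. Here is a minimal sketch using only Python’s standard library. It assumes you already have the page’s HTML as a string (fetching it is left out), and the class and function names are my own. It simply reports the href of the first link element whose rel includes "canonical".

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only until a canonical URL is found.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Run `find_canonical(page_html)` over your reused posts and compare the result with the source you meant to credit. A `None` means the page has no canonical tag at all.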

3. Robots.txt

Like the canonical tag, a robots.txt file is a signal to search engines: it tells crawlers such as Google’s which pages on your site they should not crawl. It’s often used in ecommerce, where you might sell the same item in different colours and you simply can’t create original, unique content for every colour sold.

A little more complicated to set up but something you really need to wrap your head around if you fall into the ecommerce space.

You can get an intro into Robots.txt files and how to set them up here.
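Before you upload a robots.txt file, it helps to test which URLs your rules would actually block. The sketch below uses Python’s built-in urllib.robotparser for that. The rules and the shop.example URLs are made up for the t-shirt scenario, and the helper name is my own. Keep in mind this shows how Python’s parser reads the rules; real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for an example shop: keep crawlers away from the
# colour variants and leave the main (white) product page crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /t-shirt/blue
Disallow: /t-shirt/green
Disallow: /t-shirt/red
"""


def blocked_urls(robots_txt: str, urls: list, user_agent: str = "*") -> list:
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]
```

Pass it the list of product URLs from your sitemap and check that only the variants you intended come back as blocked. A too-broad Disallow line can hide pages you want ranked.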

It’s not “if” it’s “when” you’ll be caught

While I don’t want to scare you, I will... Google will eventually catch you with duplicated content. It’s not a matter of “if”, it’s “when”. And when they do, they will see it as cheating and dump your rankings. Even if your duplicate or very common content is unintentional (i.e. you didn’t want to cheat but just deliver a great experience for users), your rankings will drop into the abyss.

Our advice: get working on eliminating duplicate content today.


Written by Evelyn Wolf
