Do you know how to find duplicate content and fix it?
If not, you should.
Duplicate content can cause quite an SEO headache.
In fact, it can confuse Google’s crawlers and bring down your rankings, all without your knowledge.
You may be there right now – wondering why some of your pages aren’t ranking as highly as they could be. Maybe you’ve spent days staring at your computer screen with bloodshot eyes trying to figure out what’s going wrong. 😣
Duplicate content could be it, especially if you’ve never checked for it before (let alone heard of it).
Fun fact: Duplicate content accounts for 29% of the entire web, according to the most recent study I could find, which dates from 2015. These days, that percentage is probably even higher if we account for the sheer amount of content that gets published daily.
So let’s stop this problem before it drives your website over a cliff. It’s time to learn how to find duplicate content and fix it. 🔧
That’s exactly what we’ll discuss in this guide.
What Is Duplicate Content (And Why Should You Care About It)?
Duplicate content is just what it sounds like: exact copies or similar versions of content that appear either on separate websites or the same website.
Let’s examine each scenario:
- Duplicate content on separate websites – This, my friends, is plagiarism. If some entity other than you snatches an exact copy of your content and publishes it on their website, they’re stealing your work and ideas.
  - The same goes even if that person/brand/organization was using your page as a reference and didn’t properly paraphrase or rewrite the content in their own words. To learn more about plagiarism (and its seriousness), check out this article from the University of Oxford.
  - The same goes if the situation is reversed: If you copy or inadequately paraphrase someone else’s content (intentionally or not), you’re the plagiarizer and have created duplicate content.
- Duplicate content on the same website – This is when extremely similar or exact-match content appears on multiple pages of your site. This scenario is much more common, especially if your website is large with hundreds or even thousands of pages of content. However, it can happen to smaller websites, too, and it’s usually totally unintentional.
Why Is Duplicate Content a Problem?
When duplicate content is plagiarized content, the problem is obvious. ❌ With duplicate content on your own site, on the other hand, the problem boils down to Google rankings.
When you have two (or more) pieces of content that look nearly identical, Google won’t know which one to rank. In the end, this drives down your rankings for all of the pages involved – even if the content is fantastic.
And rankings are what bring in traffic and leads. For SEO blogging to work, your pages need to rank highly and appear at the top of Google for your keywords. That’s because:
- Few users take their Google search past page one. On average, the clicks beyond that are abysmal – only 0.78% of users click on something on page two.
- Compare that to the #1 position on Google, which nets you a click-through rate (CTR) of 6%, amounting to over 5 MILLION average clicks.
For SEO to work, you need to reach page one. And you won’t do that with duplicate content.
SO – let’s talk about how to find duplicate content and fix it using two great tools: Copyscape (both the free and premium versions) and Siteliner.
(By the way, keyword cannibalization is an SEO issue closely related to duplicate content.)
How to Find Duplicate Content on Your Website Using Siteliner
Siteliner is a tool that will scan your entire website to find duplicate content.
For smaller websites, the free version will give you plenty of data to work with, since it will scan up to 250 pages once a month. (If you have a larger site or want full access to all the data and features, you’ll need to spring for the premium version.)
To perform a site scan, simply enter your URL into the search box.
When your report is ready, you’ll see lots of useful information, like how many pages were checked, what percentage of your content is duplicated, and stats about how your site stacks up to others.
Click on “Duplicate content” in the top left menu to see a detailed breakdown.
When you look at your report, don’t worry if you see some high match percentages at the top, especially if these are your main website pages (product pages, “about” page, landing pages, etc.).
That’s because this tool will show you EVERY instance of duplicate content on a page, including menus, excerpts, footers, and sidebar content.
What you need to worry about are larger chunks of content appearing across multiple pages.
For example, the first page that isn’t a main site page on my duplicate content list is a blog. It has 467 words matching another page.
To check if this matching content is part of regular text repeated across my site or something more serious, I can click on that entry in the list to see exactly where the duplicate content comes from.
As you can see, there are three different sources:
- Content that matches another page on my site (highlighted in pink)
- Navigational content (highlighted in green)
- Common content that normally appears across my site (highlighted in gray)
In this instance, I’d investigate the pink highlighted text and determine if I need to make any changes to either page.
See how that works? It’s pretty simple, and doing this monthly or quarterly could ensure duplicate content never drags down your Google rankings.
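If you’re curious how a checker might measure matching text between two pages, here’s a minimal Python sketch. To be clear, this is not Siteliner’s method (the tool doesn’t publish one that I know of). It’s one common approach: compare overlapping five-word sequences, often called shingles. The function names `shingles` and `overlap_percent` and the five-word window are my own assumptions for illustration.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences ("shingles") in text."""
    # Lowercase and keep only word-like tokens so punctuation doesn't matter.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_percent(page_a: str, page_b: str, size: int = 5) -> float:
    """Percentage of page_a's shingles that also appear in page_b."""
    a, b = shingles(page_a, size), shingles(page_b, size)
    if not a:
        # page_a is shorter than one shingle, so there is nothing to compare.
        return 0.0
    return 100.0 * len(a & b) / len(a)


page_a = "one two three four five six"
page_b = "one two three four five seven"
print(overlap_percent(page_a, page_b))  # 50.0
```

A longer window makes the check stricter, since only longer copied runs count as a match. Shared menus and footers would still show up, which is why you’d want to strip those before comparing.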
How to Find Duplicate Content on the Web Using Copyscape
Beyond finding duplicate content on your site, a great best practice before you publish any piece of content is to run it through a checker like Copyscape, especially if you outsource your writing. This is how you:
- Find out if your content is 100% unique and original
- Discover any plagiarism issues that need correction
There are two ways to do this with two versions of Copyscape – the free version and the premium one.
By the way, Copyscape is run by the same people behind Siteliner. It’s another reliable tool that plenty of SEO pros use. It’s also super affordable, which makes it my top recommendation to check for plagiarism and duplicate content on the web.
Copyscape (Free Version): Check Published Content to Find Duplicate Content
The free version of Copyscape will only allow you to enter a URL (i.e., content that’s already published) to compare it to what’s on the web. Searches are limited.
Here’s how to use it:
Go to the Copyscape homepage, enter the URL of the content you’d like to check in the search box, and hit “Go.” For example, I’m checking a recent Content Hacker blog.
The first page that pops up will be a list of results that match the content you’re checking. This means at least some of the text is duplicated.
For this example, all the results come from my content around the web, including my Amazon author page. That’s perfectly fine since I use similar wording in my bios and profiles to tell my story.
To look closer at a result, click on the blue text. This will show you exactly which text is duplicated and where it appears on the page.
To see the duplicate text in action on your source page, click “See matching content in: Source Page.”
This will show you exactly where the matching text appears on your source page.
As you can see, this instance of duplicate text isn’t an issue. It’s just my bio, which stays fairly consistent across all of the platforms I publish on.
If you see other sites listed in the results that aren’t connected to you, dig deeper, and check the percentage of duplicate text. A match of 1-4% isn’t worth worrying about, for instance.
BUT, if you see vast chunks of text – 7% and above is a red flag – copied from your page to theirs, or vice-versa, you need rewrites, STAT.
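If you check a lot of results, the rule of thumb above is easy to write down as a tiny Python helper. This is just a sketch of my thresholds, not anything built into Copyscape. The function name `triage_match` is hypothetical, and since I didn’t give a rule for matches between 4% and 7%, the sketch assumes those deserve a manual look.

```python
def triage_match(percent: float) -> str:
    """Map a duplicate-text match percentage to a suggested next step."""
    if percent < 0 or percent > 100:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 7:
        # 7% and above is the red-flag zone.
        return "red flag: rewrite"
    if percent <= 4:
        # Small matches (bios, common phrases) are usually harmless.
        return "not worth worrying about"
    # Assumption: no rule is given for 4-7%, so flag it for a human to review.
    return "review manually"


for pct in (2, 5.5, 12):
    print(pct, "->", triage_match(pct))
```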
Copyscape Premium: Check Unpublished Content to Find Duplicate Content
I prefer Copyscape Premium over the free version mainly because of how easy and affordable it is.
You get way more features in Premium, too, like batch searching, file uploads, and plagiarism tracking.
Here’s how to use it to check content before you publish and ensure it’s original ✅:
First, sign up for Premium by choosing a username and password.
Now, here’s where Copyscape Premium veers a bit from the online tools you might be used to. For one, there’s no subscription for this tool – instead, you purchase a bulk sum of credits, which you then spend on searches.
- $0.03 for every search up to 200 words
- An additional $0.01 per 100 words beyond your first 200
- Plus, you can use credits any time within 12 months of purchase
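So, if you want to run a blog post of 2,000 words through Copyscape Premium, the total cost would be $0.21: $0.03 for the first 200 words, plus $0.18 for the remaining 1,800 words (18 blocks of 100 words at $0.01 each). (Like I said, affordable!)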
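To estimate the cost for any word count, here’s a small Python sketch of the per-search pricing listed above. The function name `copyscape_cost_cents` is hypothetical, and I’m assuming a partial block of 100 words is billed as a full block, which the pricing above doesn’t spell out. It works in whole cents to avoid rounding surprises.

```python
def copyscape_cost_cents(word_count: int) -> int:
    """Estimate the cost of one Premium search, in cents."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    base_cents = 3  # covers the first 200 words
    extra_words = max(0, word_count - 200)
    # Assumption: any partial block of 100 extra words is rounded up.
    extra_blocks = (extra_words + 99) // 100
    return base_cents + extra_blocks  # 1 cent per extra block


print(copyscape_cost_cents(2000))  # 21
```

For a 2,000-word post, that’s 3 cents plus 18 extra blocks, or 21 cents in total.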
So, go ahead and purchase however many credits you’d like.
Then head back to Premium search.
Now we can upload our unpublished content file to check it against the web. Underneath the text box (where you can paste a section of text to check), find the “Choose File” button and click it.
Find where your content file is saved and open it. Then click the “Premium Search” button.
For this example, I’m checking a blog that’s still in the draft stage.
The results page will show you any matches on the web with duplicate content.
In my blog draft, I included a snippet of code for embedding a video, and that’s the only text that’s showing up as a match in my results. That means this piece is 100% original! 💯
However, if you’re seeing any matches on your content that perk up your attention, you can click on each result to view more details and find the match percentage – just like with the free version of Copyscape.
And, though it goes without saying, if you find you’ve inadvertently copied someone else, edit your content to be 100% unique.
Break Free from SEO Worries Like Duplicate Content: Here’s How
Finding duplicate content on your site and fixing it is super important.
But most business owners don’t even realize they’re committing this SEO mistake, let alone losing content ROI.
I’ve given you some tools and tips on how to find duplicate content, but what if you didn’t have to worry about this at all?
What if you had a content strategy in place that mapped out your content so well, you were always on top of potential duplicate content issues?
What if you built a content team at your disposal that created great content on the regular?
I teach you how to do all of this in my Content Transformation System, a 12-month coaching program that’s all about building your business through content the EXACT same way I did on my way to six and seven figures.
Ready to find out if you’re a good fit? Apply today.