How to Find Duplicate Content on Your Site & Improve Your SEO

Do you know how to find duplicate content and fix it?

If not, you should.

Duplicate content can cause quite an SEO headache.

In fact, it can confuse Google’s crawlers and bring down your rankings, all without your knowledge.

You may be there right now – wondering why some of your pages aren’t ranking as highly as they could be. Maybe you’ve spent days staring at your computer screen with bloodshot eyes trying to figure out what’s going wrong. 😣

Duplicate content could be it, especially if you’ve never heard of it before (let alone checked for it).

Fun fact: Duplicate content accounts for 29% of the entire web, according to the most recent study I could find, which dates from 2015. These days, that percentage is probably even higher given the sheer amount of content that gets published daily.

So let’s stop this problem before it drives your website over a cliff. It’s time to learn how to find duplicate content and fix it. 🔧

That’s exactly what we’ll discuss in this guide.

What Is Duplicate Content (And Why Should You Care About It)?

Duplicate content is just what it sounds like: exact copies or similar versions of content that appear either on separate websites or the same website.

Let’s examine each scenario:

  • Duplicate content on separate websites – This, my friends, is plagiarism. If some entity other than you snatches an exact copy of your content and publishes it on their website, they’re stealing your work and ideas.
    • The same goes even if that person/brand/organization was using your page as a reference and didn’t properly paraphrase or rewrite the content in their own words. To learn more about plagiarism (and its seriousness), check out this article from the University of Oxford.
    • The same goes if the situation is reversed: If you copy or inadequately paraphrase someone else’s content (intentionally or not), you’re the plagiarizer and have created duplicate content.
  • Duplicate content on the same website – This is when extremely similar or exact-match content appears on multiple pages of your site. This scenario is much more common, especially if your website is large with hundreds or even thousands of pages of content. However, it can happen to smaller websites, too, and it’s usually totally unintentional.
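
To make “exact copies or similar versions” a little more concrete, here’s a minimal Python sketch of one common way to score how alike two pieces of text are: break each one into overlapping word sequences (often called shingles) and compare the two sets. This is an illustration only. It is not how Google, Siteliner, or Copyscape measure duplication, and the function names `shingles` and `similarity` are made up for this example.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size` in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole thing as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=5):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    page_one = "the quick brown fox jumps"
    page_two = "the quick brown fox sleeps"
    print(similarity(page_one, page_two, size=3))
```

Identical text scores 1.0 and text with nothing in common scores 0.0. Anything in between is a judgment call about how much overlap is too much.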

Why Is Duplicate Content a Problem?

When duplicate content is plagiarized content, the problem is obvious. ❌ On your own site, though, the problem with duplicate content boils down to Google rankings.

When you have two (or more) pieces of content that look nearly identical, Google may not know which one to rank. In the end, this can drive down your rankings for all of the pages involved – even if the content is fantastic.

And rankings are what bring in traffic and leads. For SEO blogging to work, your pages need to rank highly and appear at the top of Google for your keywords. That’s because:

  1. Few users take their Google search past page one. On average, the clicks beyond that are abysmal – only 0.78% of users click on something on page two.
  2. Compare that to the #1 position on Google, which nets you a click-thru rate (CTR) of 6%, which amounts to over 5 MILLION average clicks.

For SEO to work, you need to reach page one. And you won’t do that with duplicate content.

SO – let’s talk about how to find duplicate content and fix it using two great tools: Copyscape (both the free and premium versions) and Siteliner.

(By the way, keyword cannibalization is an SEO issue closely related to duplicate content. Learn about it in my video below [an oldie but a goodie].)

How to Find Duplicate Content on Your Website Using Siteliner

Siteliner is a tool that will scan your entire website to find duplicate content.

For smaller websites, the free version will give you plenty of data to work with, since it will scan up to 250 pages once a month. (If you have a larger site or want full access to all the data and features, you’ll need to spring for the premium version.)

To perform a site scan, simply enter your URL into the search box.

When your report is ready, you’ll see lots of useful information, like how many pages were checked, what percentage of your content is duplicated, and stats about how your site stacks up to others.

Click on “Duplicate content” in the top left menu to see a detailed breakdown.

When you look at your report, don’t worry if you see some high match percentages at the top, especially if these are your main website pages (product pages, “about” page, landing pages, etc.).

That’s because this tool will show you EVERY instance of duplicate content on a page, including menus, excerpts, footers, and sidebar content.

What you need to worry about are larger chunks of content appearing across multiple pages.

For example, the first page that isn’t a main site page on my duplicate content list is a blog. It has 467 words matching another page.

To check if this matching content is part of regular text repeated across my site or something more serious, I can click on that entry in the list to see exactly where the duplicate content comes from.

As you can see, there are three different sources:

  • Content that matches another page on my site (highlighted in pink)
  • Navigational content (highlighted in green)
  • Common content that normally appears across my site (highlighted in gray)

In this instance, I’d investigate the pink highlighted text and determine if I need to make any changes to either page.
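
If you’re curious how a check like this could work in principle, here’s a rough Python sketch of the idea: treat word sequences that show up on most pages as site-wide boilerplate (menus, footers) and only flag what two pages share beyond that. Everything here is an assumption for illustration, including the names `word_runs` and `flag_shared_passages`, the 8-word run length, and the “more than half of all pages” cutoff. It is not Siteliner’s actual method.

```python
from collections import Counter
from itertools import combinations


def word_runs(text, size=8):
    """Return every run of `size` consecutive words in `text`."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def flag_shared_passages(pages, size=8, common_share=0.5):
    """Map each pair of URLs to the number of non-boilerplate runs they share.

    `pages` is a dict of URL -> page text (hypothetical input format).
    """
    runs = {url: word_runs(text, size) for url, text in pages.items()}

    # Count how many pages each run appears on.
    seen_on = Counter(run for page_runs in runs.values() for run in page_runs)

    # Assumption: a run on more than `common_share` of all pages (and on
    # more than two pages) is navigation or footer text, not real content.
    limit = max(2, common_share * len(pages))
    boilerplate = {run for run, count in seen_on.items() if count > limit}

    flagged = {}
    for url_a, url_b in combinations(sorted(runs), 2):
        shared = (runs[url_a] & runs[url_b]) - boilerplate
        if shared:
            flagged[(url_a, url_b)] = len(shared)
    return flagged
```

A pair of pages with a high count is the equivalent of the pink highlighted text above: worth opening side by side to decide whether one of them needs a rewrite.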

See how that works? It’s pretty simple, and doing this monthly or quarterly could ensure duplicate content never drags down your Google rankings.

Besides SEO issues like duplicate content, what else is plaguing your online business growth? Are you struggling to hire, delegate, scale, or manage all the little details? Learn where you’re going wrong and get the pathway to success in my free training.

How to Find Duplicate Content on the Web Using Copyscape

Beyond finding duplicate content on your site, a great best practice before you publish any piece of content is to run it through a checker like Copyscape, especially if you outsource your writing. This is how you:

  • Find out if your content is 100% unique and original
  • Discover any plagiarism issues that need correction

There are two ways to do this with two versions of Copyscape – the free version and the premium one.

By the way, Copyscape is run by the same people behind Siteliner. It’s another reliable tool that plenty of SEO pros use. It’s also super affordable, which makes it my top recommendation to check for plagiarism and duplicate content on the web.

Copyscape (Free Version): Check Published Content to Find Duplicate Content

The free version of Copyscape will only allow you to enter a URL (i.e., content that’s already published) to compare it to what’s on the web. Searches are limited.

Here’s how to use it:

Go to the Copyscape homepage, enter the URL of the content you’d like to check in the search box, and hit “Go.” For example, I’m checking a recent Content Hacker blog.

The first page that pops up will be a list of results that match the content you’re checking. This means at least some of the text is duplicated.

For this example, all the results come from my content around the web, including my Amazon author page. That’s perfectly fine since I use similar wording in my bios and profiles to tell my story.

To look closer at a result, click on the blue text. This will show you exactly which text is duplicated and where it appears on the page.

To see the duplicate text in action on your source page, click “See matching content in: Source Page.”

This will show you exactly where the matching text appears on your source page.

As you can see, this instance of duplicate text isn’t an issue. It’s just my bio, which stays fairly consistent across all of the platforms I publish on.

If you see other sites listed in the results that aren’t connected to you, dig deeper, and check the percentage of duplicate text. A match of 1-4% isn’t worth worrying about, for instance.

BUT, if you see vast chunks of text – 7% and above is a red flag – copied from your page to theirs, or vice-versa, you need rewrites, STAT.
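
If you’re reviewing a long list of results, those rules of thumb are easy to turn into a tiny helper. Here’s a hedged Python sketch using the thresholds from this guide. Note that the 5-6% range isn’t covered by the rules above, so treating it as “take a closer look” is my assumption here, and the function name `triage_match` is made up for the example.

```python
def triage_match(percent):
    """Sort a match percentage into one of three buckets."""
    if percent >= 7:
        return "red flag: rewrite needed"
    if percent <= 4:
        return "not worth worrying about"
    # Assumption: the guide gives no rule for this middle range.
    return "take a closer look"


if __name__ == "__main__":
    for match in (2, 5.5, 12):
        print(f"{match}% -> {triage_match(match)}")
```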

Copyscape Premium: Check Unpublished Content to Find Duplicate Content

I prefer Copyscape Premium over the free version mainly because of how easy and affordable it is.

You get way more features in Premium, too, like batch searching, file uploads, and plagiarism tracking.

Here’s how to use it to check content before you publish and ensure it’s original ✅:

First, sign up for Premium by choosing a username and password.

Now, here’s where Copyscape Premium veers a bit from the online tools you might be used to. For one, there’s no subscription for this tool – instead, you purchase a bulk sum of credits, which you then spend on searches.

Pricing:

  • $0.03 for every search up to 200 words
  • An additional $0.01 per 100 words beyond your first 200
  • You can use credits any time within 12 months of purchase

So, if you want to run a blog post of 2,000 words through Copyscape Premium, the total cost would be $0.21: $0.03 for the first 200 words, plus 18 × $0.01 for the remaining 1,800. (Like I said, affordable!)
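
Want to estimate the cost for other word counts? Here’s a quick Python sketch based on the pricing listed above. Applied to a 2,000-word post, it gives 3 cents for the first 200 words plus 18 more cents for the remaining 1,800. The function name `search_cost_cents` is hypothetical, and rounding a partial block of 100 words up to a full cent is my assumption, not something the pricing list spells out.

```python
import math


def search_cost_cents(word_count):
    """Estimate the cost of one search, in cents, from the rates above."""
    base_cents = 3  # covers the first 200 words
    extra_words = max(0, word_count - 200)
    # Assumption: any partial block of 100 extra words is billed as a full block.
    extra_blocks = math.ceil(extra_words / 100)
    return base_cents + extra_blocks


if __name__ == "__main__":
    for words in (200, 1000, 2000):
        print(f"{words} words: ${search_cost_cents(words) / 100:.2f}")
```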

So, go ahead and purchase however many credits you’d like.

Then head back to Premium search.

Now we can upload our unpublished content file to check it against the web. Underneath the text box (where you can paste a section of text to check), find the “Choose File” button and click it.

Find where your content file is saved and open it. Then click the “Premium Search” button.

For this example, I’m checking a blog that’s still in the draft stage.

The results page will show you any matches on the web with duplicate content.

In my blog draft, I included a snippet of code for embedding a video, and that’s the only text that’s showing up as a match in my results. That means this piece is 100% original! 💯

However, if you’re seeing any matches on your content that catch your attention, you can click on each result to view more details and find the match percentage – just like with the free version of Copyscape.

And, though it goes without saying, if you find you’ve inadvertently copied someone else, edit your content to be 100% unique.

Break Free from SEO Worries Like Duplicate Content: Here’s How

Finding duplicate content on your site and fixing it is super important.

But most business owners don’t even realize they’re committing this SEO mistake, let alone losing content ROI.

I’ve given you some tools and tips on how to find duplicate content, but what if you didn’t have to worry about this at all?

Even further, what if you didn’t have to worry about content – period?

What if, instead, your content was done for you from the ground up, including…

  • A dedicated writer trained in online & SEO writing
  • Brand style guidelines
  • Content topics mapped to a content calendar (and vetted for originality)
  • Content management

No “what-ifs.” This exists:

It’s our new service at Content Hacker, our Done-For-You Content Creation Engine, now accepting clients. 🚂

If you’re ready to hand off content to humans who know how to do it right, talk to us today to get started.

About Julia McCoy

Julia McCoy is an entrepreneur, 6x author, and a leading strategist around creating exceptional content and brand presence that lasts online. At 19 years old, in 2011, she used her last $75 to build a 7-figure agency, Express Writers, which she grew to $5M and sold ten years later. In the 2020s, she's devoted to running The Content Hacker, where she teaches creative entrepreneurs the strategy, skills, and systems they need to build a self-sustaining business, so they are finally freed up to create lasting legacy and generational impact.