"Duplicate content" sounds like it's about word-for-word copying - someone scraping your blog post or you accidentally publishing the same article twice. The actual operational problem is different and much more common: two pages on YOUR OWN SITE targeting the same query intent, with overlapping coverage, fragmenting the ranking signals between them. Neither one ranks where it should because Google can't figure out which is the canonical answer. This is keyword cannibalization - the most under-diagnosed reason established sites stop earning rank lifts even when they keep publishing.
This guide goes straight to the workflow: the cannibalization mental model, the on-site vs cross-site duplicate distinction, the consolidation decision tree for fixing duplicates at scale, and the audit move that finds the worst offenders in 30 minutes.
Find duplicate titles and descriptions across your sitemap
Two tools work together here. The duplicate-page-title checker crawls a sitemap and groups pages with identical titles. The duplicate-meta-description checker does the same for descriptions. Identical titles or descriptions are usually a strong signal that two pages are competing for the same query - the diagnosis starts there.
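The grouping step behind these checkers can be sketched in a few lines of Python. This is a minimal sketch, not the tools' actual implementation: `pages` is a hypothetical list of `(url, title)` pairs you would collect by crawling each URL in your XML sitemap and reading its `<title>`.

```python
from collections import defaultdict

def group_duplicate_titles(pages):
    """Group page URLs by normalized title; return only clusters of 2+.

    `pages` is a list of (url, title) pairs, e.g. gathered by crawling
    every URL in your sitemap and extracting the <title> tag.
    """
    clusters = defaultdict(list)
    for url, title in pages:
        # Normalize case and whitespace so near-identical titles collide.
        key = " ".join(title.lower().split())
        clusters[key].append(url)
    return {title: urls for title, urls in clusters.items() if len(urls) >= 2}

pages = [
    ("/blog/seo-titles-2021", "How to Write SEO Titles"),
    ("/blog/seo-titles-2024", "How to write  SEO titles"),
    ("/blog/meta-descriptions", "Writing Meta Descriptions"),
]
print(group_duplicate_titles(pages))
# {'how to write seo titles': ['/blog/seo-titles-2021', '/blog/seo-titles-2024']}
```

The same grouping works for meta descriptions; swap the title for the `content` attribute of `<meta name="description">`.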
The cannibalization mental model
The right way to think about duplicates is competition, not copying. Two of your own pages targeting the same query - whether or not the body text overlaps - compete for the same SERP slot. Google then does one of three things:
- Pick one page as the canonical answer for that query. The other page either doesn't rank, or ranks with reduced signals because Google has split credit between the two.
- Switch its pick from week to week, producing the maddening pattern where your "main" target page rotates through ranks 5, 8, 3, 12 across consecutive weeks because Google can't decide which page should win.
- Decide neither is strong enough and rank a competitor instead.
None of these are good for you. The fix is to NOT have two pages competing for the same query - either consolidate them, differentiate them on intent, or pick a winner and demote the loser.
On-site duplicates vs cross-site duplicates
On-site duplicates (the common case)
Your own pages targeting overlapping intent. Examples:
- A blog post on "how to write SEO titles" published in 2021, AND a blog post on "best practices for SEO page titles" published in 2024. Same query intent, two competing URLs.
- A category page at `/products/crm` AND a content hub at `/learn/crm`, both targeting "crm" as a head term. Different formats but overlapping target queries.
- An old "what is X" article AND a new "complete guide to X" article. Both informational, both targeting the same head terms.
The fix: the consolidation decision tree (next section).
Cross-site duplicates (the rarer case)
Your content republished on other sites. Examples:
- You syndicate posts to Medium / LinkedIn. The platform's authority can outrank yours unless the syndicated copy carries a rel=canonical pointing back to your URL.
- A scraper site copies your articles. These are usually low-quality and Google usually handles them correctly, but verify with periodic copy-detection (Copyscape, Google site: searches).
- A partner publishes guest posts they wrote for you AND keeps copies on their own site - sometimes word-identical to yours.
The fix: rel=canonical from the partner / syndicated copy back to your URL. See the canonical tag playbook for the cross-domain workflow.
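Verifying that a syndicated copy's canonical actually points home can be automated. A minimal sketch with the standard-library `html.parser` (the `CanonicalFinder` class and the sample HTML are illustrative, not from any real platform):

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> in a page."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical" and self.canonical is None:
            self.canonical = a.get("href")

def canonical_points_home(html, your_url):
    """True if the page's canonical tag points at your original URL."""
    parser = CanonicalFinder()
    parser.feed(html)
    return parser.canonical == your_url

syndicated = '<html><head><link rel="canonical" href="https://yoursite.com/post"></head></html>'
print(canonical_points_home(syndicated, "https://yoursite.com/post"))  # True
```

In practice you would fetch each syndicated URL's HTML and run this check on a schedule; a missing or self-referencing canonical on the platform copy is the failure case to alert on.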
The consolidation decision tree
For each on-site duplicate cluster (two or more pages competing for the same query), pick exactly one of these four moves:
1. Merge into one page
Best when: both pages cover overlapping ground, neither has unique angles the other lacks, and consolidating makes the resulting article more comprehensive.
Workflow: pick the URL that has more internal links, more backlinks, and stronger Search Console history. Move the unique content from the other page into it. 301 the loser to the winner. Backlinks consolidate; the merged page is now stronger.
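The winner pick can be made mechanical once you have the metrics in a spreadsheet. A sketch under stated assumptions: the metric keys and the weights below are illustrative choices, not a formula from Google.

```python
def pick_winner(candidates):
    """Pick the consolidation winner from a duplicate cluster.

    `candidates` maps URL -> metrics dict with illustrative keys:
    internal_links, backlinks, gsc_clicks_90d. The weights are
    assumptions - tune them to how your site values each signal.
    """
    def score(m):
        return m["internal_links"] + 3 * m["backlinks"] + 0.1 * m["gsc_clicks_90d"]
    return max(candidates, key=lambda url: score(candidates[url]))

cluster = {
    "/blog/seo-titles-2021": {"internal_links": 12, "backlinks": 8, "gsc_clicks_90d": 340},
    "/blog/seo-titles-2024": {"internal_links": 4, "backlinks": 1, "gsc_clicks_90d": 90},
}
print(pick_winner(cluster))  # /blog/seo-titles-2021
```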
2. Differentiate on intent
Best when: the two pages cover related topics but have legitimately different intents - one informational, one transactional, one commercial-investigation.
Workflow: rewrite the H1, title, meta description, and intro of each page to make the intent distinction clear. Use the intent-mapping playbook to verify each page's format matches its target intent. Then internal-link them as siblings, not competitors.
3. Pick a winner, noindex the loser
Best when: one of the pages is genuinely outdated or weaker, but you don't want to delete it (some pages serve as legacy reference material or have non-SEO purposes).
Workflow: add <meta name="robots" content="noindex"> to the loser. Update internal links to point at the winner instead. Loser stays accessible but stops competing in search.
4. Delete and 301
Best when: the loser page is genuinely no-value and has minimal inbound links.
Workflow: 301 the loser URL to the winner URL. Update internal links. The page is gone; equity transfers.
The biggest mistake: thinking only word-for-word copying counts
The "is this a duplicate?" question gets asked too narrowly. Most teams check whether the body text matches another URL byte-for-byte. By that test, almost nothing is a duplicate. By the operational test (do these pages compete for the same SERP slot?), most large sites have dozens of duplicates.
The audit move:
- Run the duplicate-title checker. Pages with identical or near-identical titles are almost always intent-duplicates, even if their body text differs.
- Pull your top 50 target queries from Search Console. For each, check how many of your URLs are showing impressions for that query. If 3+ of your URLs receive impressions for the same query, you have ranking fragmentation - usually cannibalization.
- Search Google for "site:yoursite.com [query]" on each cannibalized query. Multiple URLs returned that all target the same intent = consolidation candidates.
- Read the H1s side-by-side for any candidate cluster. If they're paraphrases of each other, intent overlap is real and consolidation is needed.
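The Search Console step above is easy to script against a performance export. A minimal sketch: `rows` is a hypothetical export of `(query, page_url, impressions)` tuples with the page dimension enabled.

```python
from collections import defaultdict

def find_fragmented_queries(rows, min_urls=3):
    """Flag queries where several of your URLs earn impressions.

    `rows` is a list of (query, page_url, impressions) tuples, e.g.
    from a Search Console performance export. Queries where min_urls
    or more URLs show impressions are cannibalization candidates.
    """
    by_query = defaultdict(set)
    for query, page, impressions in rows:
        if impressions > 0:
            by_query[query].add(page)
    return {q: sorted(pages) for q, pages in by_query.items() if len(pages) >= min_urls}

rows = [
    ("seo titles", "/blog/a", 120),
    ("seo titles", "/blog/b", 80),
    ("seo titles", "/learn/titles", 15),
    ("meta descriptions", "/blog/c", 200),
]
print(find_fragmented_queries(rows))
# {'seo titles': ['/blog/a', '/blog/b', '/learn/titles']}
```

Run this over your top 50 queries by impressions; each flagged query becomes a cluster to feed into the decision tree.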
What a clean duplicate-content audit looks like
Run this quarterly on established sites, plus before / after any large content migration.
- Run the duplicate-title checker on your XML sitemap. Group by identical title. Each cluster of 2+ pages is a duplicate candidate.
- Run the duplicate-meta-description checker. Some duplicate intent clusters have unique titles but identical / boilerplate descriptions - this catches them.
- Pull top 50 queries by impressions from Search Console. Filter to queries where 3+ of your URLs are receiving impressions. These are cannibalization candidates.
- Apply the consolidation decision tree to each cluster: merge, differentiate, noindex-and-keep, or 301-and-delete.
- Update internal links after each consolidation. Old internal links pointing at the loser URL should now point at the winner.
- Verify the 301 chains and noindex flags with the HTTP status checker and a Search Console URL Inspection on the affected URLs.
- Re-test rank movement 4-8 weeks later. Cannibalization fixes typically produce noticeable rank lifts on the consolidated query within two crawl cycles.
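The 301 verification step can include a pre-flight check of the redirect map itself, before anything ships. A sketch under stated assumptions: `redirects` is a hypothetical old-URL-to-new-URL map, and a clean consolidation map has every old URL pointing directly at its final destination (one hop, no loops).

```python
def find_redirect_problems(redirects, max_hops=1):
    """Report chains longer than max_hops and loops in a 301 map.

    `redirects` maps old URL -> new URL. Chains waste crawl budget
    and dilute equity; loops break the page entirely.
    """
    problems = {}
    for start in redirects:
        seen, url, hops = {start}, start, 0
        while url in redirects:
            url = redirects[url]
            hops += 1
            if url in seen:
                problems[start] = "loop"
                break
            seen.add(url)
        else:
            if hops > max_hops:
                problems[start] = f"chain of {hops} hops"
    return problems

redirects = {
    "/old-a": "/old-b",   # chains through /old-b before reaching /winner
    "/old-b": "/winner",
    "/loop-1": "/loop-2", # loops back on itself
    "/loop-2": "/loop-1",
}
print(find_redirect_problems(redirects))
```

Fix every flagged entry by pointing it straight at the final URL, then confirm the live responses with the HTTP status checker.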
Grab the one-page audit checklist
A printable version of the cannibalization mental model, the consolidation decision tree, the four-move workflow (merge / differentiate / noindex / delete), and a Google Sheet template for tracking duplicate clusters with the chosen consolidation move and the 301 / canonical / noindex implementation status.
Quick quiz: are you ready to find your own duplicates?
Five questions, takes two minutes. We'll show you the right answer and a one-line explanation after each one.
Next up in Keyword Research & Content
Once your existing content is consolidated and not competing with itself, the next move is making sure new content doesn't recreate the same problem. From here:
- Content briefs that rank - how to write briefs that produce ranking content (without recreating cannibalization).
- Tracking and analytics setup - the verification checklist that closes the SEO measurement loop.