How to delete duplicate pages from a PDF
Scanners, OCR runs, and combined-PDF workflows often produce duplicate pages. Here is how to spot and remove them — visually, in your browser, with no upload.
Open the Delete Pages tool →
Duplicate pages in a PDF are the kind of mess that creeps in quietly: a scanner that jammed, an OCR run that appended instead of replacing, a combine-PDFs workflow with overlap. They make the file bigger, throw off the table of contents, and make the document harder for the recipient to read. The fix is a visual sweep: spot them in a thumbnail grid and delete them, in your browser.
Why duplicates happen
- Scanner restarts. The paper jammed mid-scan, you put the page back and re-scanned, but the partial result had already been saved. Now you have one-and-a-half copies of that page.
- OCR re-runs. OCR tools that “add searchable text” sometimes append a duplicate copy of the entire document instead of replacing it. Easy to miss until you scroll.
- Combine-PDFs overlap. You merged a packet from two people and they each included the cover sheet.
- Browser print-to-PDF bugs. Older versions of Chrome had a notorious one that printed every page twice.
- Stitched-together emails. The same attachment appeared several times in the chain, and every copy ended up in the PDF.
The clean-up flow
- Open the Delete Pages tool.
- Drop the PDF in. Every page renders as a thumbnail in a grid.
- Scan the grid. Duplicates jump out — two thumbnails next to each other that look identical.
- Click each duplicate to mark it for deletion. It dims with an “X” overlay so you can see which pages will be removed.
- Click “Delete N pages” and download.
The cleaned PDF keeps the remaining pages in their original order. Text, fonts, and vector graphics on those pages are unchanged.
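The bookkeeping behind "original order minus the deleted pages" can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: `pagesToKeep` and `removalOrder` are hypothetical names, and page indices are assumed to be zero-based. If pages are removed one at a time by index (as with pdf-lib's `removePage`), working from the highest index down means earlier removals never shift the positions of pages still waiting to be removed.

```typescript
// Hypothetical helpers for illustration; indices are zero-based.

/** Indices of the pages that survive, in their original order. */
function pagesToKeep(pageCount: number, marked: number[]): number[] {
  const keep: number[] = [];
  for (let i = 0; i < pageCount; i++) {
    if (marked.indexOf(i) === -1) keep.push(i);
  }
  return keep;
}

/**
 * Marked indices with repeats and out-of-range values dropped,
 * sorted from highest to lowest so that removing one page never
 * changes the index of a page that is still queued for removal.
 */
function removalOrder(pageCount: number, marked: number[]): number[] {
  const unique = marked.filter(
    (p, pos) =>
      Math.floor(p) === p && p >= 0 && p < pageCount && marked.indexOf(p) === pos
  );
  return unique.sort((a, b) => b - a);
}
```

For a six-page file with the second and fifth pages marked, `pagesToKeep(6, [1, 4])` yields `[0, 2, 3, 5]` and `removalOrder(6, [1, 4])` yields `[4, 1]`.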
Spotting near-duplicates
Some “duplicates” are not quite identical:
- Same content, different scan quality — a page that was re-scanned because the first was crooked. Keep the better one.
- Same form, different fill-in — early and late copies of the same form during a review process. Keep the final.
- Same image, different annotations — a scanned page with handwriting on one copy and not the other. Decide which to keep based on whether the annotation matters.
In all three cases, click only the version you do not want to keep.
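To make "not quite identical" concrete, here is a rough sketch of how two thumbnails could be compared numerically. It is only an illustration of the idea, not something the tool does: the function names, the grayscale-byte input format, and the threshold of 12 are all assumptions chosen for the example.

```typescript
// Hypothetical comparison of two same-sized grayscale thumbnails
// (one byte per pixel, 0 = black, 255 = white).

type Similarity = "identical" | "near-duplicate" | "different";

/** Average per-pixel brightness difference between two thumbnails. */
function meanAbsDiff(a: Uint8Array, b: Uint8Array): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("thumbnails must be non-empty and the same size");
  }
  let total = 0;
  for (let i = 0; i < a.length; i++) {
    total += Math.abs(a[i] - b[i]);
  }
  return total / a.length;
}

/** nearThreshold is an assumed cut-off, not a calibrated value. */
function classify(a: Uint8Array, b: Uint8Array, nearThreshold: number = 12): Similarity {
  const diff = meanAbsDiff(a, b);
  if (diff === 0) return "identical";
  return diff <= nearThreshold ? "near-duplicate" : "different";
}
```

A score like this can only flag candidates. Choosing between a crooked scan and a straight one, or between an annotated copy and a clean one, still takes a human look.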
Long documents
For 200+ page PDFs, the thumbnail grid renders progressively. Two options:
- Just scroll — the grid renders as you scroll, so most users can spot-check a long document in a couple of minutes.
- Split first: use the Split PDF tool to break the document into smaller sections, deduplicate each section, then re-merge with the Merge PDF tool. It takes more steps, but each section is easier to focus on.
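If you take the split-first route, it helps to plan the section boundaries up front so that no page is missed or reviewed twice. The sketch below shows one way to do that arithmetic. `sectionRanges` and `toGlobalIndex` are hypothetical names, and the section size is whatever you find comfortable to review in one pass.

```typescript
// Hypothetical planning helpers; page indices are zero-based and
// each section covers start (inclusive) to end (exclusive).

interface Section {
  start: number;
  end: number;
}

/** Cut pageCount pages into consecutive sections of at most sectionSize pages. */
function sectionRanges(pageCount: number, sectionSize: number): Section[] {
  if (sectionSize < 1 || Math.floor(sectionSize) !== sectionSize) {
    throw new Error("sectionSize must be a positive whole number");
  }
  const sections: Section[] = [];
  for (let start = 0; start < pageCount; start += sectionSize) {
    sections.push({ start: start, end: Math.min(start + sectionSize, pageCount) });
  }
  return sections;
}

/** Translate a page position inside a section back to its position in the full document. */
function toGlobalIndex(section: Section, localIndex: number): number {
  if (localIndex < 0 || localIndex >= section.end - section.start) {
    throw new Error("localIndex is outside the section");
  }
  return section.start + localIndex;
}
```

A 230-page document cut into sections of 100 gives three ranges: 0 to 100, 100 to 200, and 200 to 230. The eighth page of the second section (local index 7) is page index 107 in the original.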
Privacy
Sweeping for duplicates in a sensitive scan (medical, legal, financial) is exactly the kind of task that should not involve uploading the file to a stranger’s server. PDF.js renders the previews and pdf-lib does the deletions, both in your browser. To confirm that nothing is uploaded, keep the Network tab of your browser’s developer tools open while you work.
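For a more systematic version of the Network tab check, most browsers can export the recorded requests as a HAR file, which can then be scanned for anything that sent data out. The sketch below assumes the HAR 1.2 field names `request.method`, `request.url`, and `request.bodySize`. The function name `requestsWithBody` is made up for this example, and the check is a heuristic rather than proof, since data could in principle also leave through a URL query string.

```typescript
// Minimal subset of the HAR 1.2 entry shape used by this sketch.
interface HarRequest {
  method: string;
  url: string;
  bodySize: number; // -1 when the size is not available
}

interface HarEntry {
  request: HarRequest;
}

/** Entries that use a data-sending method or carry a request body. */
function requestsWithBody(entries: HarEntry[]): HarEntry[] {
  return entries.filter((entry) => {
    const method = entry.request.method.toUpperCase();
    const sendsData = method === "POST" || method === "PUT" || method === "PATCH";
    return sendsData || entry.request.bodySize > 0;
  });
}
```

An empty result after loading, marking, and downloading a PDF is consistent with the file never leaving the browser.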
Step by step
- Open the Delete Pages tool.
- Drop the PDF on the upload area — every page renders as a thumbnail.
- Scan visually for duplicates (they look identical side-by-side in the grid).
- Click each duplicate to mark it for deletion (it dims with an "X").
- Click "Delete N pages" and download the cleaned PDF.
FAQs
- How do duplicate pages usually get into a PDF?
- Common causes: a paper jam mid-scan followed by a re-scan without removing the partial result; combining multiple "starts" of the same document; PDF merge tools that include an overlap; OCR re-runs that appended instead of replacing. Any time a PDF was assembled from multiple passes, duplicates are likely.
- Will the tool auto-detect duplicates?
- No, the current version relies on manual visual review: you spot them in the thumbnail grid. We may add a duplicate-detection pass in a future version, but the manual flow tends to be safer because near-duplicates (same content, different annotations or scan quality) are often things you actually want to keep.
- My PDF is 200 pages. Can I really scan all the thumbnails?
- Yes — thumbnails render progressively, and you can scroll while they fill in. For very long documents you can also use the Split PDF tool first to break it into sections and clean each section separately. Often duplicates cluster in one region of the document.
- Is the file uploaded?
- No. Page previews are rendered by PDF.js in your browser, and the page-removal step is done by pdf-lib, also locally. The cleaned PDF is built in memory and offered as a download — no upload, ever. Confirm in the Network tab.
Related guides
- Remove blank pages: Sheet-fed scans almost always include empty pages from blank backs and separator sheets. Here's how to remove them cleanly, in your browser, with no upload.
- Extract pages from a PDF: Sometimes you only need a few pages from a long PDF, such as a single contract clause or two pages from a long report. Here's how to pull them out, in your browser.
- Compress a scanned PDF: Scanned PDFs are usually mostly image, and image PDFs shrink dramatically. Here's how to compress a scanned PDF in your browser without sacrificing legibility.