How do duplicate pages usually get into a PDF?

Common causes: a paper jam mid-scan followed by a re-scan without removing the partial result; combining multiple "starts" of the same document; PDF merge tools that include an overlap; OCR re-runs that appended their output instead of replacing it. Any time a PDF was assembled from multiple passes, duplicates are likely.

Will the tool auto-detect duplicates?

No. The current version relies on manual visual review: you spot duplicates in the thumbnail grid. We may add a duplicate-detection pass in a future version, but the manual flow tends to be safer because near-duplicates (same content, different annotations or scan quality) are often things you actually want to keep.

My PDF is 200 pages. Can I really scan all the thumbnails?

Yes — thumbnails render progressively, and you can scroll while they fill in. For very long documents you can also use the Split PDF tool first to break it into sections and clean each section separately. Often duplicates cluster in one region of the document.

Is the file uploaded?

No. Page previews are rendered by PDF.js in your browser, and the page-removal step is done by pdf-lib, also locally. The cleaned PDF is built in memory and offered as a download, with no upload at any point. You can confirm this in the Network tab of your browser's developer tools.

How to delete duplicate pages from a PDF

Scanners, OCR runs, and combined-PDF workflows often produce duplicate pages. Here is how to spot and remove them — visually, in your browser, with no upload.

Duplicate pages in a PDF are the kind of mess that creeps in quietly: a scanner that jammed, an OCR run that appended its output instead of replacing it, a combine-PDFs workflow with overlap. They make the file bigger, the table of contents wrong, and the recipient’s reading experience worse. The fix is a visual sweep: spot them in a thumbnail grid and delete them, in your browser.

Why duplicates happen

Scanner restarts. Paper jam mid-scan, you put the page back and re-scanned, but the partial result was already saved. Now you have one-and-a-half copies of that page.
OCR re-runs. OCR tools that “add searchable text” sometimes append a duplicate copy of the entire document instead of replacing it. Easy to miss until you scroll.
Combine-PDFs overlap. You merged a packet from two people and they each included the cover sheet.
Browser print-to-PDF bugs. Older versions of Chrome had a notorious “every page twice” bug.
Stitched-together emails. The same attachment appeared several times in the chain and was included each time.

The clean-up flow

Open the Delete Pages tool.
Drop the PDF in. Every page renders as a thumbnail in a grid.
Scan the grid. Duplicates jump out — two thumbnails next to each other that look identical.
Click each duplicate to mark it for deletion. It dims with an “X” overlay so you can see what will be removed.
Click “Delete N pages” and download.

The cleaned PDF keeps the remaining pages in their original order. Text, fonts, and vector graphics on those pages are unchanged.
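
The deletion step can be sketched roughly as follows. This is an illustration, not the tool's actual source: `PageRemovable` is a hypothetical interface that mirrors the `getPageCount()` and `removePage(index)` methods found on pdf-lib's `PDFDocument`, and `removeMarkedPages` is a made-up helper name. The detail worth showing is the descending order: removing a page shifts every later index down by one, so deleting from the end keeps the remaining marked indices valid.

```typescript
// Hypothetical minimal document shape. pdf-lib's PDFDocument has methods
// with these names, but this sketch does not import the library.
interface PageRemovable {
  getPageCount(): number;
  removePage(index: number): void;
}

// Remove every marked page (0-based indices) and return how many were removed.
function removeMarkedPages(doc: PageRemovable, marked: number[]): number {
  const pageCount = doc.getPageCount();
  // Keep valid, unique, whole-number indices, then sort high to low.
  const indices = marked
    .filter(
      (i, pos) =>
        Math.floor(i) === i && i >= 0 && i < pageCount && marked.indexOf(i) === pos
    )
    .sort((a, b) => b - a);
  // Deleting from the end means earlier indices never shift under us.
  for (const index of indices) {
    doc.removePage(index);
  }
  return indices.length;
}
```

With a six-page document where the pages at indices 2 and 4 are marked, the helper removes index 4 first and index 2 second, leaving the other four pages in their original order.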

Spotting near-duplicates

Some “duplicates” are not quite identical:

Same content, different scan quality — a page that was re-scanned because the first was crooked. Keep the better one.
Same form, different fill-in — early and late copies of the same form during a review process. Keep the final.
Same image, different annotations — a scanned page with handwriting on one copy and not the other. Decide which to keep based on whether the annotation matters.

In all three cases, click only the version you do not want to keep.
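
The tool itself does no automatic matching, but the difference between exact and near-duplicates can be made concrete with a short sketch. It assumes each page has already been reduced to a fingerprint string (for example, a hash of its rendered pixels); how that fingerprint is produced is outside the sketch, and `groupExactDuplicates` is a hypothetical helper, not part of the tool.

```typescript
// Group 0-based page indices whose fingerprints are identical.
// Only groups containing two or more pages are returned.
function groupExactDuplicates(fingerprints: string[]): number[][] {
  const groups: number[][] = [];
  // Null-prototype object so any string is safe to use as a key.
  const groupOf: Record<string, number[]> = Object.create(null);
  fingerprints.forEach((fp, index) => {
    let group = groupOf[fp];
    if (!group) {
      group = [];
      groupOf[fp] = group;
      groups.push(group); // groups stay in first-seen order
    }
    group.push(index);
  });
  return groups.filter((g) => g.length > 1);
}
```

A re-scanned or annotated copy of a page would produce a different fingerprint and so would not be grouped with the original. That is why exact matching can at best suggest candidates, and the three cases above still need a human decision.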

Long documents

For 200+ page PDFs, the thumbnail grid renders progressively. Two options:

Just scroll — the grid renders as you scroll, so most users can spot-check a long document in a couple of minutes.
Split first: use the Split PDF tool to break the document into smaller sections, deduplicate each section, then re-merge with the Merge PDF tool. This takes more steps, but each section is easier to focus on.
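
The split-first option comes down to dividing the page range into fixed-size sections. Here is a minimal sketch of that arithmetic, assuming 1-based inclusive page numbers and a hypothetical helper name `sectionRanges`; the Split PDF tool may choose its boundaries differently.

```typescript
// Divide pages 1..pageCount into consecutive sections of at most
// sectionSize pages. Each range is [firstPage, lastPage], inclusive.
function sectionRanges(pageCount: number, sectionSize: number): Array<[number, number]> {
  if (Math.floor(pageCount) !== pageCount || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  if (Math.floor(sectionSize) !== sectionSize || sectionSize < 1) {
    throw new RangeError("sectionSize must be a positive integer");
  }
  const ranges: Array<[number, number]> = [];
  for (let first = 1; first <= pageCount; first += sectionSize) {
    // The last section may be shorter than sectionSize.
    ranges.push([first, Math.min(first + sectionSize - 1, pageCount)]);
  }
  return ranges;
}
```

For a 230-page document and sections of 50 pages, this gives 1-50, 51-100, 101-150, 151-200, and 201-230. A duplicate pair that straddles a boundary ends up in two different sections, so it is worth comparing the last page of one section with the first page of the next.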

Privacy

Sweeping for duplicates in a sensitive scan (medical, legal, financial) is exactly the kind of task that should not involve uploading the file to a stranger’s server. PDF.js renders the previews and pdf-lib does the deletions, both in your browser. Confirm zero uploads in the Network tab.
