Office Letter: Removing Dupes in Word

Here's a clever search-and-replace string to remove duplicate paragraphs from a Word document.

InformationWeek Staff, Contributor

October 26, 2005

Bill Coan, the brains behind one of our favorite utilities (DataPrompter), wrote with this enhancement to a previous reader tip about deleting duplicate entries in a Word list:

  • Claudio Faria offered a great tip on removing duplicates in Word. As you'll recall, she suggested using Word's Edit>Replace command and then placing a checkmark next to Use Wildcards. With wildcards enabled, Claudio recommended entering the following values into the Replace dialog box:

    Find What: ([!^13]@)^13\1^13

    Replace With: \1^p

    The only drawback to this approach is that you have to click Replace repeatedly until Word has found and deleted all duplicate paragraphs.

    I'd like to offer a small refinement to Claudio's technique, which makes it possible to find and delete all duplicate paragraphs with a single click. As with Claudio's technique, you start by sorting your document so that identical paragraphs sit next to each other. Next, choose Edit>Replace and place a checkmark next to Use Wildcards. Then enter the following values into the Replace dialog box:

    Find What: ([!^13]@)^13(\1^13)@

    Replace With: \2

    When you click Replace, Word finds and deletes all duplicate paragraphs. For a full explanation of the wildcard search terms, please review Claudio's excellent summary in the last issue. What follows is a condensed explanation of why this works:

    Find What: ([!^13]@)^13

    The above expression means: "Find one or more non-paragraph-marks followed by a paragraph mark" (i.e., find a complete paragraph).

    Find What: ([!^13]@)^13(\1^13)

    The above expression means: "Find one or more non-paragraph-marks followed by a paragraph mark followed by an identical series of non-paragraph-marks followed by a paragraph mark" (i.e., find a complete paragraph followed by an identical complete paragraph).

    Find What: ([!^13]@)^13(\1^13)@

    The above expression means: "Find one or more non-paragraph-marks followed by a paragraph mark followed by one or more identical series of non-paragraph-marks followed by a paragraph mark" (i.e., find a complete paragraph followed by one or more identical complete paragraphs).

    Replace With: \2

    The above expression means: "Replace the found text with the second parenthetical expression in the Find What text" (i.e., replace the found text with a single copy of the complete paragraph).

    -- Bill Coan, www.wordsite.com
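
For readers who want to see the difference between the two patterns outside Word, here is a minimal Python sketch. It is an analogy, not Word's own search engine: paragraph marks (^13) are modeled as "\r", Word's @ is approximated by the regular-expression +, and the function names and sample list are hypothetical. The sketch also adds a guard so that a match can only begin at the start of a paragraph; that guard is an assumption of this example and not part of either tip.

```python
import re

# Assumption: a paragraph mark (^13 in Word) is modeled as "\r".
# Guard (added by this sketch): a match may only start at the beginning
# of the text or immediately after a paragraph mark.
START = r"(?<![^\r])"

# Rough analogue of  Find What: ([!^13]@)^13\1^13
PAIR = re.compile(START + r"([^\r]+)\r\1\r")

# Rough analogue of  Find What: ([!^13]@)^13(\1^13)@
RUN = re.compile(START + r"([^\r]+)\r(\1\r)+")


def replace_pairs_once(text: str) -> str:
    """One pass of the pairwise pattern (Replace With: \\1^p)."""
    return PAIR.sub(lambda m: m.group(1) + "\r", text)


def replace_runs(text: str) -> str:
    """One pass of the run pattern (Replace With: \\2)."""
    return RUN.sub(lambda m: m.group(2), text)


# A sorted list in which "apple" appears three times.
SAMPLE = "apple\rapple\rapple\rbanana\rcherry\rcherry\r"

after_pairs = replace_pairs_once(SAMPLE)  # still contains two "apple" lines
after_runs = replace_runs(SAMPLE)         # each paragraph appears once
```

On the sample list, one pass of the pairwise pattern still leaves two copies of "apple", so it has to be run again. The run pattern collapses each group of identical paragraphs in a single pass, which is the point of the refinement described above.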

Editor's note: Don't miss our review of DataPrompter. Thanks for your insight, Bill. If you have a tip to share with Office Letter readers, please send it to [email protected].

The Office Letter is a weekly e-mail and online newsletter offering tips, tricks, and techniques for Microsoft Office. It offers shortcuts, explores features, and boosts productivity with hands-on how-to information for Word, Excel, Outlook, PowerPoint, and more.
