Office Letter: Removing Dupes in Word
Here's a clever search-and-replace string to remove duplicate paragraphs from a Word document.
Bill Coan, the brains behind one of our favorite utilities (DataPrompter), wrote with this enhancement to a previous reader tip about deleting duplicate entries in a Word list:
Claudio Faria offered a great tip on removing duplicates in Word. As you'll recall, she suggested using Word's Edit>Replace command and then placing a checkmark next to Use Wildcards. With wildcards enabled, Claudio recommended entering the following values into the Replace dialog box:
Find What: ([!^13]@)^13\1^13
Replace With: \1^p
The only drawback to this approach is that you have to click Replace repeatedly until Word has found and deleted all duplicate paragraphs.
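To see why one pass of the pairwise pattern can leave duplicates behind, here is a rough analogue written for Python's `re` module rather than Word's wildcard engine. The translation (`[!^13]@` to `[^\n]+`, `^13` to `\n`), the added line-start anchor, and the function names are assumptions made for illustration; Word's own matching may differ in detail.

```python
import re

# Rough Python analogue of Find What: ([!^13]@)^13\1^13, Replace With: \1^p
# Assumption: paragraphs end with "\n" instead of Word's paragraph mark.
# The leading ^ (with re.MULTILINE) is an addition in this sketch so that
# a match can only begin at the start of a line.
PAIR = re.compile(r"^([^\n]+)\n\1\n", re.MULTILINE)


def dedupe_pairs_once(text: str) -> str:
    """Collapse each adjacent pair of identical lines in a single pass."""
    return PAIR.sub(r"\1\n", text)


def dedupe_pairs_until_stable(text: str) -> str:
    """Repeat the pass until nothing changes (the 'click repeatedly' step)."""
    while True:
        new_text = dedupe_pairs_once(text)
        if new_text == text:
            return text
        text = new_text
```

With four identical lines, a single pass pairs them off two by two and leaves two copies behind, so a second pass is needed before only one remains.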
I'd like to offer a small refinement to Claudio's technique, which makes it possible to find and delete all duplicate paragraphs with a single click. As with Claudio's technique, you start by sorting your document so that identical paragraphs sit next to each other. Then choose Edit>Replace, place a checkmark next to Use Wildcards, and enter the following values into the Replace dialog box:
Find What: ([!^13]@)^13(\1^13)@
Replace With: \2
When you click Replace, Word finds and deletes all duplicate paragraphs. For a full explanation of the wildcard search terms, please review Claudio's excellent summary in the last issue. What follows is a condensed explanation of why this works:
Find What: ([!^13]@)^13
The above expression means: "Find one or more non-paragraph-marks followed by a paragraph mark" (i.e., find a complete paragraph).
Find What: ([!^13]@)^13(\1^13)
The above expression means: "Find one or more non-paragraph-marks followed by a paragraph mark followed by an identical series of non-paragraph-marks followed by a paragraph mark" (i.e., find a complete paragraph followed by an identical complete paragraph).
Find What: ([!^13]@)^13(\1^13)@
The above expression means: "Find one or more non-paragraph-marks followed by a paragraph mark followed by one or more identical series of non-paragraph-marks followed by a paragraph mark" (i.e., find a complete paragraph followed by one or more identical complete paragraphs).
Replace With: \2
The above expression means: "Replace the found text with the second parenthetical expression in the Find What text" (i.e., replace the found text with a single copy of the complete paragraph, including its paragraph mark).
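The whole procedure (sort, then collapse each run of identical paragraphs to one copy) can be sketched outside Word as well. The following is a minimal analogue in Python's `re` module, not Word's wildcard engine; the function name `remove_duplicate_lines`, the use of `\n` as the paragraph separator, and the added line-start anchor are assumptions of this sketch.

```python
import re

# Rough Python analogue of Find What: ([!^13]@)^13(\1^13)@, Replace With: \2
# Group 1 is one paragraph's text; group 2 is a repeated copy of that text
# plus its line ending. The leading ^ (with re.MULTILINE) is an addition in
# this sketch so that a match can only begin at the start of a line.
RUN = re.compile(r"^([^\n]+)\n(\1\n)+", re.MULTILINE)


def remove_duplicate_lines(text: str) -> str:
    """Sort the lines, then collapse each run of identical lines to one."""
    # Sorting places identical lines next to each other, as the tip requires.
    lines = sorted(text.splitlines())
    joined = "".join(line + "\n" for line in lines)
    # \2 holds a single copy of the repeated line, including its "\n".
    return RUN.sub(r"\2", joined)


if __name__ == "__main__":
    print(remove_duplicate_lines("pear\napple\npear\napple\napple\n"), end="")
```

As in the Word expression, the group that matches a paragraph requires at least one character, so runs of empty lines are left alone.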
-- Bill Coan, www.wordsite.com
Editor's note: Don't miss our review of DataPrompter. Thanks for your insight, Bill. If you have a tip to share with Office Letter readers, please send it to [email protected].
The Office Letter is a weekly e-mail and online newsletter offering tips, tricks, and techniques for Microsoft Office. It offers shortcuts, explores features, and boosts productivity with hands-on how-to information for Word, Excel, Outlook, PowerPoint, and more.