---
title: "Finding & Removing Duplicates"
description: "Complete guide to SheetClean's duplicate detection and removal features including advanced options, column selection, and best practices"
order: 1
lastUpdated: "2025-01-20"
readingTime: 8
keywords: ["find duplicates", "remove duplicates", "quick dedupe", "duplicate detection", "column comparison"]
---

Finding & Removing Duplicates

Master SheetClean's powerful duplicate detection and removal features to keep your data clean and organized.

Overview

SheetClean offers two main approaches to handling duplicates:

🔍 Find Duplicates - Advanced detection with preview and multiple actions

  • Choose exactly which columns to compare
  • Preview results before taking action
  • Multiple action options (highlight, delete, move, etc.)
  • Find duplicates, unique values, or first occurrences

⚡ Quick Dedupe - Fast one-click duplicate removal

  • Automatic settings detection
  • Instant removal with backup
  • Perfect for standard duplicate cleanup
  • Ideal for recurring tasks

Find Duplicates: Step-by-Step

The Find Duplicates feature gives you complete control over duplicate detection.

Step 1: Select Your Data Range

  1. Open the SheetClean sidebar
  2. Under 🔍 Find, click Find Duplicates
  3. Your current selection is displayed at the top

Quick actions:

  • Click the 📄 Select all data icon to automatically select all data
  • Click 🔄 Sync range to update the displayed selection

The info bar shows: Sheet1 • 100 rows × 4 columns (A1:D100)

Step 2: Choose Columns to Compare

Select which columns should be checked for duplicates.

Column Selection Interface:

Each column is listed with:

  • ☐ Checkbox to include/exclude
  • Column letter (A, B, C, etc.)
  • Header name (if detected) or "Column A"

Selection Tips:

Compare all columns - Check all boxes

  • Use when: You want exact duplicates where every field matches
  • Example: Finding completely identical customer records

Compare specific columns - Check only key columns

  • Use when: You want to find business duplicates
  • Example: Check only "Email" to find duplicate customers regardless of name changes

Compare multiple identifiers - Check 2-3 key columns

  • Use when: You need more accuracy than single-column comparison
  • Example: Check "First Name" + "Last Name" + "Email" for customer duplicates
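
As a rough illustration of why column selection matters, the Python sketch below groups rows by the values in the checked columns only. It is a simplified model, not SheetClean's actual implementation; the function names `comparison_key` and `group_rows`, the sample rows, and the e-mail addresses are hypothetical.

```python
from collections import defaultdict

def comparison_key(row, columns):
    """Build the tuple of values used to decide whether two rows match."""
    return tuple(row[c] for c in columns)

def group_rows(rows, columns):
    """Group row indexes by their comparison key."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[comparison_key(row, columns)].append(index)
    return dict(groups)

# Hypothetical data: columns are First Name, Last Name, Email.
rows = [
    ["John", "Smith", "john@example.com"],
    ["Johnny", "Smith", "john@example.com"],
    ["Jane", "Doe", "jane@example.com"],
]

by_email = group_rows(rows, columns=[2])       # compare "Email" only
by_all = group_rows(rows, columns=[0, 1, 2])   # compare every column

print(by_email[("john@example.com",)])  # [0, 1]
print(len(by_all))                       # 3
```

Comparing only the e-mail column puts the first two rows in one group despite the different first names, while comparing every column treats all three rows as distinct.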

Step 3: Configure Options

Three important options at the top:

☐ My table has headers

  • Check this if row 1 contains column names
  • SheetClean will skip row 1 during comparison
  • Column names appear in the selection list

☑ Skip empty rows (Recommended)

  • Ignores completely blank rows
  • Prevents false duplicate detection
  • Enabled by default

☐ Match case

  • When enabled: "Apple" and "apple" are different
  • When disabled: "Apple" and "apple" are the same
  • Disabled by default for most use cases
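
The three options can be pictured as a preprocessing pass that runs before any rows are compared. The Python sketch below is a minimal model under that assumption, not SheetClean's code; `prepare_rows` and its parameter names are hypothetical.

```python
def prepare_rows(rows, has_headers=False, skip_empty=True, match_case=False):
    """Return (sheet_row_number, comparison_values) pairs ready for comparison."""
    prepared = []
    for number, row in enumerate(rows, start=1):
        if has_headers and number == 1:
            continue  # row 1 holds column names, not data
        values = ["" if cell is None else str(cell) for cell in row]
        if skip_empty and all(v.strip() == "" for v in values):
            continue  # a completely blank row is ignored
        if not match_case:
            values = [v.casefold() for v in values]  # "Apple" and "apple" become equal
        prepared.append((number, tuple(values)))
    return prepared

rows = [
    ["Fruit"],   # header
    ["Apple"],
    [""],        # blank row
    ["apple"],
]

loose = prepare_rows(rows, has_headers=True)                    # Match case off
strict = prepare_rows(rows, has_headers=True, match_case=True)  # Match case on

print(loose)   # [(2, ('apple',)), (4, ('apple',))]
print(strict)  # [(2, ('Apple',)), (4, ('apple',))]
```

With Match case off, rows 2 and 4 produce the same comparison values and would be reported as duplicates; with it on, they stay distinct.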

Step 4: Choose What to Find

Select what type of results you want:

● Find duplicates (Default)

  • Shows all rows that appear more than once
  • Includes ALL occurrences (first + duplicates)
  • Most common choice

○ Find unique values

  • Shows rows that appear only once
  • Opposite of duplicates
  • Useful for finding one-time customers, unique transactions, etc.

○ Find first occurrences

  • Shows only the first instance of each duplicate group
  • Excludes later duplicates
  • Useful for creating a clean master list
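
The difference between the three modes is easier to see on a small list. The Python sketch below is an illustrative model only; `find_rows` is a hypothetical name, and it reads "first occurrences" literally as the first row of each group that has duplicates, which is an assumption about how rows that appear once are treated.

```python
from collections import Counter

def find_rows(keys, mode="duplicates"):
    """Return indexes of the rows selected by the given mode.

    keys: one hashable comparison key per row, in sheet order.
    """
    if mode not in ("duplicates", "unique", "first"):
        raise ValueError(f"unknown mode: {mode}")
    counts = Counter(keys)
    seen = set()
    result = []
    for index, key in enumerate(keys):
        first_time = key not in seen
        seen.add(key)
        if mode == "duplicates" and counts[key] > 1:
            result.append(index)   # every occurrence, including the first
        elif mode == "unique" and counts[key] == 1:
            result.append(index)   # appears exactly once
        elif mode == "first" and counts[key] > 1 and first_time:
            result.append(index)   # first row of each duplicate group
    return result

keys = ["a", "b", "a", "c", "b", "a"]

print(find_rows(keys, "duplicates"))  # [0, 1, 2, 4, 5]
print(find_rows(keys, "unique"))      # [3]
print(find_rows(keys, "first"))       # [0, 1]
```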

Step 5: Select Actions

Choose what to do with the found results. You can select multiple actions:

☐ Fill with color

  • Highlights found rows with your chosen color
  • Click [Select Color] to choose from red, orange, yellow, green, blue, purple, or pink
  • Default: Light red for duplicates

☐ Add status column

  • Inserts a new column with status labels
  • Labels: "Duplicate", "Unique", or "First Occurrence"
  • Makes results easy to filter and sort

☐ Copy to another location

  • Copies found rows to a different location
  • Specify destination (e.g., F1 or Sheet2!A1)
  • Leaves original rows unchanged

☐ Move to another location

  • Moves found rows to a different location
  • Removes rows from original location
  • Specify destination

☐ Clear values

  • Removes content from found rows
  • Keeps rows but empties all cells
  • Use with caution

☐ Delete entire rows

  • Permanently removes found rows
  • Creates backup automatically
  • Cannot be undone without backup

Action Combinations:

Common combinations:

  • Highlight only: ☑ Fill with color
  • Highlight and label: ☑ Fill with color + ☑ Add status column
  • Remove duplicates: ☑ Delete entire rows (in Find duplicates mode the results include every occurrence, so the first occurrence is deleted too)
  • Separate duplicates: ☑ Move to another location (moves to "Duplicates" sheet)
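
For intuition about the Add status column action, here is a minimal Python sketch that assigns one of the three labels to each row. How SheetClean assigns labels in each mode is not specified in this guide, so the rule used here (the first row of a repeated group is "First Occurrence", later rows are "Duplicate", rows that appear once are "Unique") is an assumption, as are the function name `status_labels` and the sample addresses.

```python
from collections import Counter

def status_labels(keys):
    """Return one status label per row, in sheet order."""
    counts = Counter(keys)
    seen = set()
    labels = []
    for key in keys:
        if counts[key] == 1:
            labels.append("Unique")
        elif key in seen:
            labels.append("Duplicate")
        else:
            labels.append("First Occurrence")
        seen.add(key)
    return labels

# Hypothetical comparison keys (for example, the Email column).
emails = ["a@example.com", "b@example.com", "a@example.com"]
labels = status_labels(emails)

print(labels)  # ['First Occurrence', 'Unique', 'Duplicate']
```

A label column like this is what makes filtering and sorting the results straightforward before deciding on a destructive action.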

Step 6: Review and Process

Click Next to see a summary of your settings:

Review screen shows:

  • Range: A1:C100
  • Columns: A, B (which columns are being compared)
  • Mode: Find duplicates
  • Actions: Fill with color, Add status column
  • Options: Skip empty rows: Yes, Match case: No, Has headers: No

Click Process to run the duplicate check.

Understanding Results

After processing, you'll see a results summary:

Found Statistics:

  • Total rows scanned: 100
  • Unique groups: 7 (number of distinct value combinations)
  • Duplicates found: 14 (rows that match other rows)
  • Processing time: 0.8s

Actions taken:

  • ✓ Highlighted with color
  • ✓ Added status column
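
One possible reading of the statistics above is that "Unique groups" counts the distinct value combinations that occur more than once and "Duplicates found" counts every row belonging to such a group. The Python sketch below reproduces the sample figures under that assumption; `summarize` and the generated keys are hypothetical, and the actual counting rules may differ.

```python
from collections import Counter

def summarize(keys):
    """Count rows, duplicate groups, and rows inside duplicate groups."""
    counts = Counter(keys)
    duplicate_groups = [key for key, n in counts.items() if n > 1]
    return {
        "total_rows": len(keys),
        "duplicate_groups": len(duplicate_groups),
        "duplicate_rows": sum(counts[key] for key in duplicate_groups),
    }

# 7 keys that each appear twice, plus 86 keys that appear once: 100 rows.
keys = [f"dup-{i}" for i in range(7)] * 2 + [f"single-{i}" for i in range(86)]
stats = summarize(keys)

print(stats)  # {'total_rows': 100, 'duplicate_groups': 7, 'duplicate_rows': 14}
```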

Save This Configuration:

Found the perfect settings? Save them for reuse:

  1. Enter a name: "Daily Cleanup"
  2. Click Save
  3. Access saved configurations from 🤖 Automations

Quick Dedupe: Fast Duplicate Removal

For standard duplicate removal, Quick Dedupe is the fastest option.

How It Works

Quick Dedupe automatically:

  1. Detects your data structure (headers, range, columns)
  2. Selects optimal comparison settings
  3. Creates a backup before removal
  4. Removes duplicates instantly
  5. Shows a summary of what was removed

Using Quick Dedupe

  1. Open the SheetClean sidebar
  2. Under 🗑️ Remove, click Remove Duplicates
  3. Review auto-detected settings
  4. Choose what to keep (first, last, or remove all)
  5. Click Quick Remove Duplicates

Auto-Detected Settings Display

SheetClean shows what it detected:

  • Range: A1:D50 (the data being processed)
  • Headers: Yes (Row 1) or No headers
  • Columns: All columns (what's being compared)
  • Total rows: 50 (how many rows will be checked)

These settings are automatically optimized based on your data.

Removal Strategies

Choose what to do with duplicates:

● Keep first occurrence (Recommended)

  • Keeps the first row in each duplicate group
  • Removes all later duplicates
  • Most common choice for data cleanup

○ Keep last occurrence

  • Keeps the last/newest row in each duplicate group
  • Removes earlier duplicates
  • Use when latest data is most accurate

○ Remove all occurrences

  • Removes ALL rows in duplicate groups
  • Keeps only truly unique rows
  • Use carefully - this removes even the first occurrence

Example:

If you have:

  • Row 1: John, Smith, [email protected]
  • Row 2: John, Smith, [email protected] (duplicate)
  • Row 3: Jane, Doe, [email protected]

  • Keep first: Keeps Row 1 and 3, removes Row 2
  • Keep last: Keeps Row 2 and 3, removes Row 1
  • Remove all: Keeps only Row 3, removes Row 1 and 2
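
The three strategies can be sketched in a few lines of Python. This is an illustrative model rather than SheetClean's implementation; `rows_to_keep` and the strategy names are hypothetical, and the e-mail addresses are placeholders chosen so that rows 1 and 2 are identical.

```python
from collections import defaultdict

def rows_to_keep(rows, strategy="first"):
    """Return the 1-based row numbers that survive deduplication."""
    if strategy not in ("first", "last", "remove_all"):
        raise ValueError(f"unknown strategy: {strategy}")
    groups = defaultdict(list)
    for number, row in enumerate(rows, start=1):
        groups[tuple(row)].append(number)
    kept = []
    for numbers in groups.values():
        if len(numbers) == 1:
            kept.append(numbers[0])    # unique rows always stay
        elif strategy == "first":
            kept.append(numbers[0])
        elif strategy == "last":
            kept.append(numbers[-1])
        # "remove_all": nothing from a duplicate group is kept
    return sorted(kept)

rows = [
    ["John", "Smith", "john@example.com"],
    ["John", "Smith", "john@example.com"],
    ["Jane", "Doe", "jane@example.com"],
]

print(rows_to_keep(rows, "first"))       # [1, 3]
print(rows_to_keep(rows, "last"))        # [2, 3]
print(rows_to_keep(rows, "remove_all"))  # [3]
```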

Options Explained

☑ Create backup before removal (Highly recommended)

  • Automatically saves a copy of your data before changes
  • Stored as a hidden sheet named "Backup_[SheetName]_[timestamp]"
  • Allows you to undo if needed

☑ Show summary after completion

  • Displays results screen after processing
  • Shows rows removed and rows kept
  • Can be disabled for faster batch operations

☑ Skip empty rows

  • Ignores completely blank rows
  • Prevents empty rows from being counted as duplicates
  • Recommended to keep enabled

☐ Case sensitive comparison

  • Disabled: "Apple" = "apple" (same)
  • Enabled: "Apple" ≠ "apple" (different)
  • Enable only if capitalization is important in your data

Quick Dedupe Results

After removal, you see:

Success Screen:

  • ✓ Deduplication Complete!
  • 14 Rows Removed - Number of duplicates deleted
  • 86 Rows Kept - Number of unique rows remaining

Backup Information:

  • 💾 Backup created: [timestamp]
  • 👁️ View Backup button - Makes backup sheet visible
  • 📋 See All Backups → button - Opens backup manager

Click Remove More Duplicates to run another check.

Advanced Techniques

Comparing Specific Column Combinations

Email-based deduplication:

  • Compare only: Email column
  • Result: Finds customers with the same email, keeps one record per email

Name-based with case-insensitive matching:

  • Compare: First Name + Last Name
  • Disable: Match case
  • Result: Finds "John Smith" and "john smith" as duplicates

Multi-field business keys:

  • Compare: Customer ID + Order Date + Product
  • Result: Finds duplicate orders with exactly the same details

Handling Partial Duplicates

Scenario: Some columns match, others differ

Example data:

  • Row 1: John Smith, [email protected], 123 Main St
  • Row 2: John Smith, [email protected], 456 Oak Ave

Solution 1: Find by key fields

  • Compare only: Name + Email
  • Finds both rows as duplicates (address difference ignored)

Solution 2: Compare all fields

  • Compare: All columns
  • Finds them as unique (address is different)

Working with Large Datasets

Performance tips for 10,000+ rows:

  1. Use Quick Dedupe - Faster than Find Duplicates for simple cases
  2. Compare fewer columns - Comparing 2 columns is faster than 10
  3. Process in batches - Split very large sheets (100K+ rows) into sections
  4. Schedule during off-hours - Use automation for large nightly cleanups

Processing speeds:

  • 1,000 rows: ~1 second
  • 10,000 rows: ~5 seconds
  • 50,000 rows: ~20 seconds
  • 100,000 rows: ~45 seconds

Best Practices

Before Processing

✓ Always create backups - Enable "Create backup before removal"

✓ Test on a sample first - Try on the first 100 rows to verify settings

✓ Verify column selection - Double-check which columns you're comparing

✓ Check header detection - Ensure "My table has headers" is set correctly

Choosing the Right Feature

Use Quick Dedupe when:

  • You want standard duplicate removal
  • All columns should be compared
  • You need fast results
  • You're doing routine cleanup

Use Find Duplicates when:

  • You need to compare specific columns only
  • You want to see results before removing
  • You need multiple actions (highlight + label)
  • You want to find unique values or first occurrences

After Processing

✓ Verify results - Scroll through data to confirm correct rows were kept

✓ Check the count - If more rows were removed than expected, restore from backup

✓ Review highlighted rows - When using "Fill with color", check the highlighted data

✓ Test formulas - If your sheet has formulas, ensure they still work correctly

Common Use Cases

Customer Database Cleanup

Goal: Remove duplicate customer records

Method: Find Duplicates

  • Compare columns: Email
  • Actions: Fill with color + Add status column
  • Review before deleting

Daily Sales Data Deduplication

Goal: Remove duplicate sales entries uploaded daily

Method: Quick Dedupe

  • Keep first occurrence
  • Create automation for daily run
  • Review summary each morning

Email List Cleaning

Goal: Remove duplicate email addresses for marketing

Method: Find Duplicates

  • Compare columns: Email only (ignore name changes)
  • Action: Delete entire rows
  • Keep first occurrence

Invoice Duplicate Prevention

Goal: Find duplicate invoices by invoice number

Method: Find Duplicates

  • Compare columns: Invoice Number + Date
  • Action: Highlight with color
  • Manual review before deletion

Troubleshooting

"No duplicates found" but I see duplicates

Possible causes:

  1. Wrong columns selected - The checked columns include fields that differ between the rows you expect to match

    • Solution: Check column selection, compare only key identifier columns
  2. Extra spaces or characters - "John " and "John" are different

    • Solution: Clean data first with TRIM function or text cleanup tools
  3. Case sensitivity enabled - "John" and "john" are different

    • Solution: Disable the "Match case" option
  4. Date/time format differences - Dates may look the same but have different times

    • Solution: Compare only the date column, not datetime
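
Causes 2 and 4 both come down to values that look identical but are stored differently. The Python sketch below shows the kind of normalization that makes such values compare equal; it is an illustration of the idea, not a SheetClean feature, and `clean_cell` is a hypothetical helper.

```python
from datetime import date, datetime

def clean_cell(value):
    """Normalize a cell so visually identical values compare equal."""
    if isinstance(value, datetime):
        return value.date()             # drop the time part, keep the date
    if isinstance(value, str):
        return " ".join(value.split())  # trim ends and collapse repeated whitespace
    return value

print(clean_cell("John ") == clean_cell("John"))  # True

morning = datetime(2025, 1, 20, 9, 30)
evening = datetime(2025, 1, 20, 17, 0)
print(clean_cell(morning) == clean_cell(evening))  # True
```

Applying this kind of cleanup to a copy of the key columns before running the check avoids missed matches without altering the original values.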

Too many rows removed

Solution:

  1. Immediately restore from backup - Click "See All Backups" → Restore
  2. Review column selection - Were you comparing the right columns?
  3. Check "Keep first" vs "Remove all" - Make sure you selected the right strategy
  4. Test on a copy - Duplicate your sheet and test there first

Backup not created

Possible causes:

  1. Option was disabled - "Create backup before removal" was unchecked

    • Solution: Always keep this enabled
  2. Permission issues - SheetClean couldn't create a hidden sheet

    • Solution: Check sheet permissions, try again
  3. Sheet name conflict - A backup sheet with the same name already exists

    • Solution: Delete or rename old backups first

Next Steps

Now that you understand duplicate detection:

  1. Set up automation - Automate recurring duplicate checks
  2. Compare features - Learn to compare sheets and columns
  3. Merge duplicates - Combine duplicate rows instead of deleting


Pro tip: Save your most-used duplicate check configurations as automations to reuse them with one click.