Skip to main content

How To Find Duplicate Values in MySQL

 Summary: in this tutorial, you will learn how to find duplicate values of one or more columns in MySQL.

Data duplication happens because of many reasons. Finding duplicate values is one of the important tasks that you must deal with when working with the databases.

Setting up a sample table

First, create a table named contacts with four columns: idfirst_namelast_name, and email.

CREATE TABLE contacts ( id INT PRIMARY KEY AUTO_INCREMENT, first_name VARCHAR(50) NOT NULL, last_name VARCHAR(50) NOT NULL, email VARCHAR(255) NOT NULL );
Code language: SQL (Structured Query Language) (sql)

Second, inserts rows into the contacts table:

INSERT INTO contacts (first_name,last_name,email) VALUES ('Carine ','Schmitt','carine.schmitt@verizon.net'), ('Jean','King','jean.king@me.com'), ('Peter','Ferguson','peter.ferguson@google.com'), ('Janine ','Labrune','janine.labrune@aol.com'), ('Jonas ','Bergulfsen','jonas.bergulfsen@mac.com'), ('Janine ','Labrune','janine.labrune@aol.com'), ('Susan','Nelson','susan.nelson@comcast.net'), ('Zbyszek ','Piestrzeniewicz','zbyszek.piestrzeniewicz@att.net'), ('Roland','Keitel','roland.keitel@yahoo.com'), ('Julie','Murphy','julie.murphy@yahoo.com'), ('Kwai','Lee','kwai.lee@google.com'), ('Jean','King','jean.king@me.com'), ('Susan','Nelson','susan.nelson@comcast.net'), ('Roland','Keitel','roland.keitel@yahoo.com');
Code language: SQL (Structured Query Language) (sql)

Third, query data from the contacts table:

SELECT * FROM contacts ORDER BY email;
Code language: SQL (Structured Query Language) (sql)

In the contacts table, we have some rows that have duplicate values in the first_namelast_name, and email columns. Let’s learn how to find them.

Find duplicate values in one column

The find duplicate values in on one column of a table, you use follow these steps:

  1. First, use the GROUP BY clause to group all rows by the target column, which is the column that you want to check duplicate.
  2. Then, use the COUNT() function in the HAVING clause to check if any group have more than 1 element. These groups are duplicate.

The following query illustrates the idea:

SELECT col, COUNT(col) FROM table_name GROUP BY col HAVING COUNT(col) > 1;
Code language: SQL (Structured Query Language) (sql)

By using this query template, you can to find rows that have duplicate emails in the contacts table as follows:

SELECT email, COUNT(email) FROM contacts GROUP BY email HAVING COUNT(email) > 1;
Code language: SQL (Structured Query Language) (sql)

This picture shows the output of the query that shows the duplicate emails:

MySQL find duplicate values example

Find duplicate values in multiple columns

Sometimes, you want to find duplicate rows based on multiple columns instead of one. In this case, you can use the following query:

SELECT col1, COUNT(col1), col2, COUNT(col2), ... FROM table_name GROUP BY col1, col2, ... HAVING (COUNT(col1) > 1) AND (COUNT(col2) > 1) AND ...
Code language: SQL (Structured Query Language) (sql)

Rows are considered duplicate only when the combination of columns are duplicate therefore we used the AND operator in the HAVING clause.

For example, to find rows in the contacts table with duplicate values in first_namelast_name, and email column, you use the following query:

SELECT first_name, COUNT(first_name), last_name, COUNT(last_name), email, COUNT(email) FROM contacts GROUP BY first_name , last_name , email HAVING COUNT(first_name) > 1 AND COUNT(last_name) > 1 AND COUNT(email) > 1;
Code language: SQL (Structured Query Language) (sql)

The following illustrates the output of the query:

MySQL find duplicate values on multiple columns

In this tutorial, you have learned how to find duplicate rows based on value of one or more columns in MySQL.

Comments

Popular Posts

How to find out Max Salary from each department

You can find maximum salary for each department by grouping all records by DeptId and then using MAX() function to calculate maximum salary in each group or each department. SQL Query: SELECT DeptID, MAX(Salary) FROM Employee  GROUP BY DeptID. This questions become more interesting if Interviewer will ask you to print department name instead of department id, in that case you need to join Employee table with Department using foreign key DeptID, make sure you do LEFT or RIGHT OUTER JOIN to include departments without any employee as well.  Here is the query