How to handle duplicate data in MySql
This is very common problem in huge database to handle the duplicate data.
Tables or result sets sometimes contain duplicate records. Sometimes, it is allowed but sometimes it is required to stop duplicate records. Sometimes, it is required to identify duplicate records and remove them from the table. This chapter will describe how to prevent duplicate records occurring in a table and how to remove already existing duplicate records.
Check Duplicate data in table
SELECT `col_name`, COUNT(*) as `record_find` FROM table GROUP BY col_name HAVING `record_find` > 1;
Above query result will show the column name and number of time of record find in database table.
Delete all duplicate records
1) If you want to keep the row with the lowest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
2) If you want to keep the row with the highest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
I used this method in MySQL 5.1
Not sure about other versions.
And one more important thing – You need to do this first on a test copy of your table! 😉