There are multiple ways to Delete duplicate data from A Table using SQL Query.Here we are going to talk about using Correlated subqueries to Delete duplicate data.
Let say we have a EMPLOY table as follow
The in above EMPLOYEE table multiple records are duplicate. To remove all duplicate records, we can write very simple correlated subquery.
Delete from EMPLOYEE E1 where RRN(E1) > (Select MIN(RRN(E2)) from EMPLOYEE E2 where E2.id = E1.id)
To delete duplicate records, we need to find a unique value for each record of table so we are using SQL RRN() function to get relative record number for each record.
To complete the execution of subquery “Select MIN(RRN(E2)) from EMPLOYEE E2 where E2.id = E1.id” SQL need the value of “E1.id” from outer query. So SQL will read the first record (RRN = 1) for outer query and get the value of E1.id = 1 and pass this value to subquery. Now the subquery will be executed as
Select MIN(RRN(E2)) from EMPLOYEE E2 where E2.id = 1
If RRN found in outer query is bigger than RRN found in subquery than SQL will delete that record from EMPLOYEE table, If RRN found in outer query is smaller or equal to RRN found in subquery than the system will skip to next record of outer query. This process will go on for each and every record of outer query. At last only one record for each id with minimum RRN will be remaining in table.
So Final output will be