表名和表结构数据:users

id unionid
1 111111
2 222222
3 222222
4 111111
5 111111
6 111111
7 111111
8 222222
9 111111
10 111111
11 111111
12 333333
13 111111

方案一

删除 unionid 字段重复的数据,并保留 ID 最小的

DELETE FROM users
WHERE id NOT IN (
 SELECT u.min_id
 FROM (SELECT MIN(id) AS min_id FROM users GROUP BY unionid) AS u
);

删除后的结果

id unionid
1 111111
2 222222
12 333333

实际测试说明:本机测试 表中数据条数:417936 重复的 unionid 条数:2188 重复的 unionid 条数总数:19179(14796x1 + 4x2 + 3x5 + 2180x2) 最终删除数据:16991 执行时间:1′12″6

方案二

表名和表结构数据:users

id unionid
1 444444
2 222222
12 333333
14 111111
15 111111
16 333333
17 444444

执行删除命令

DELETE u1
FROM
	users AS u1,users AS u2
WHERE u1.unionid=u2.unionid AND u1.id > u2.id;

删除后的数据

id unionid
1 444444
2 222222
12 333333
14 111111

1、找出想保留数据的id,这一步是最关键的一步

# id 是自增的情况,很简单
SELECT MIN(id) AS min_id FROM users GROUP BY unionid;

# id 不是自增,如 UUID 时,就需要根据插入时间来排序了,别告诉我你这个也没有
SELECT min(created_at) time,openid FROM channel GROUP BY openid;

2、把其他的数据删除

附录:其他操作,随便说一下

  • 查询重复数据(多字段)
select * from miniapp_user_activity_infos_temp a where (a.openid,a.map_id,a.created_at) IN
(
 select openid,map_id,created_at from miniapp_user_activity_infos_temp
 group by openid,map_id,created_at having count(*) > 1
 )
  • 查出数据(保留ID最小的一条)
select * from channel where id IN (select min(id) from channel GROUP BY openid);
  • 看有多少条数据
select count(1) from channel where id In (select min(id) from channel GROUP BY openid);

附录:https://www.jb51.net/article/116677.htm