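Su Lao's Study Notes - MySQL: removing duplicates from query results, and deleting duplicate records while keeping the one with the smallest id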

Author: shevechco  Date: 2017-07-12  Category: RDS notes

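When using MySQL, you sometimes need to fetch the records in which a certain column is not repeated. MySQL does provide the distinct keyword to filter out redundant duplicates and keep only one, but in practice it is mostly used to return the number of distinct records rather than all the values of those records. The reason is that distinct applies to every column in the select list, so it cannot deduplicate on one column while still returning the others. This problem bothered me for a long time. If distinct could not solve it, my only option was a nested-loop query, and for a site with a very large amount of data that would certainly hurt efficiency.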

Let's look at an example first:

table
id name
1 a
2 b
3 c
4 c
5 b

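The table looks roughly like this. It is only a simple example; real cases are much more complicated. (Note that table is a reserved word in MySQL, so a table that really has this name must be quoted with backticks in queries.)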

Say I want a single statement that returns all the data with no repeated name. That means using distinct to drop the redundant duplicates.

select distinct name from `table`
The result is:
name
a
b
c
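
As noted at the start, distinct is most often used just to count distinct values. A minimal sketch of that use against the sample table above (the alias distinct_names is only illustrative):

```sql
-- Count how many different names exist in the sample table.
-- `table` is a reserved word in MySQL, so it is quoted with backticks.
SELECT COUNT(DISTINCT name) AS distinct_names
FROM `table`;
-- With the five sample rows (a, b, c, c, b) this returns 3.
```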

That seems to work. But what if I also want the id values? Let's change the query:

select distinct name, id from `table`
The result will be:
  name id
  a 1
  b 2
  c 3
  c 4
  b 5

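Why did distinct have no effect? It did take effect, but it was applied to both columns together: a row is dropped only when both its id and its name match another row.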

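Let's change the query again:

select id, distinct name from `table`

Unfortunately, all you get is an error message: distinct must come right after select. Can't we put distinct in the where clause instead? You can try, and it errors out all the same.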

In the end I arrived at this approach:

select *, count(distinct name) from `table` group by name
Result:
id name count(distinct name)
1 a 1
2 b 1
3 c 1

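Then something even more frustrating happened: just as I was about to submit this, Rongrong found a simpler solution.

select id, name from `table` group by name

Note that both of these group by queries select columns that are not in the group by clause. They run only when the ONLY_FULL_GROUP_BY SQL mode is off (it is on by default since MySQL 5.7.5), and even then MySQL is free to return any id from each group, not necessarily the smallest.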
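
If the smallest id per name is what you want, it can be requested explicitly, which also works with ONLY_FULL_GROUP_BY enabled. This is a sketch against the sample table above; the aliases t, m and min_id are only illustrative:

```sql
-- One row per name, always carrying the smallest id for that name.
SELECT MIN(id) AS id, name
FROM `table`
GROUP BY name;
-- With the sample rows this yields (1, a), (2, b), (3, c).

-- When the table has more columns than id and name, join back on the
-- smallest id of each group to fetch the complete rows.
SELECT t.*
FROM `table` AS t
JOIN (
    SELECT MIN(id) AS min_id
    FROM `table`
    GROUP BY name
) AS m ON t.id = m.min_id;
```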

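Now for deleting the duplicate records themselves, keeping only the one with the smallest id.

Method 1:
1. Create a temporary table and select the data you want to keep into it.
2. Empty the original table.
3. Load the temporary table's data back into the original table.
4. Drop the temporary table.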

mysql> select * from student;
+----+------+
| ID | NAME |
+----+------+
| 11 | aa   |
| 12 | aa   |
| 13 | bb   |
| 14 | bb   |
| 15 | bb   |
| 16 | cc   |
+----+------+
6 rows in set
mysql> create temporary table temp as select min(id),name from student group by name;
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
mysql> truncate table student;
Query OK, 0 rows affected
mysql> insert into student select * from temp;
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
mysql> select * from student;
+----+------+
| ID | NAME |
+----+------+
| 11 | aa   |
| 13 | bb   |
| 16 | cc   |
+----+------+
3 rows in set
mysql> drop temporary table temp;
Query OK, 0 rows affected

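This method clearly has efficiency problems: every surviving row is written twice, the table is empty between the truncate and the insert, and truncate table cannot be rolled back.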

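Method 2: group by name, save the smallest id of each group into a temporary table, then delete every record whose id is not in that set of smallest ids: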

mysql> create temporary table temp as select min(id) as MINID from student group by name;
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
mysql> delete from student where id not in (select minid from temp);
Query OK, 3 rows affected
mysql> select * from student;
+----+------+
| ID | NAME |
+----+------+
| 11 | aa   |
| 13 | bb   |
| 16 | cc   |
+----+------+
3 rows in set
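
Before running a delete like the one above, it can help to preview what it would remove. The following is a sketch, assuming the original six-row student table; the aliases copies, keep_id, s and k are only illustrative:

```sql
-- Names that occur more than once, how many copies each has,
-- and which id would be kept.
SELECT name, COUNT(*) AS copies, MIN(id) AS keep_id
FROM student
GROUP BY name
HAVING COUNT(*) > 1;
-- On the original data: (aa, 2, 11) and (bb, 3, 13).

-- The rows a "keep the smallest id" delete would remove: every row
-- for which another row with the same name and a smaller id exists.
SELECT DISTINCT s.id, s.name
FROM student AS s
JOIN student AS k
  ON k.name = s.name
 AND k.id < s.id;
-- On the original data: ids 12, 14 and 15.
```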

Method 3: operate directly on the original table. The SQL statement that comes to mind first is:

mysql> delete from student where id not in (select min(id) from student group by name);

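Running it fails with: 1093 - You can't specify target table 'student' for update in FROM clause
The reason: the statement modifies a table and, in a subquery, reads from that same table to decide which rows to modify. MySQL does not support this.
How can this be worked around?
Wrap the subquery in one more layer, so that the inner result is materialized as a derived table (aliased b here) before the delete runs: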

mysql> delete from student where id not in (select minid from (select min(id) as minid from student group by name) b);
Query OK, 3 rows affected
mysql> select * from student;
+----+------+
| ID | NAME |
+----+------+
| 11 | aa   |
| 13 | bb   |
| 16 | cc   |
+----+------+
3 rows in set
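
Another way to avoid error 1093, not part of the three methods above, is MySQL's multi-table delete syntax with a self-join, which involves no subquery at all. This is a sketch under the same assumptions (the student table with id and name columns); the aliases s1 and s2 and the index name uq_student_name are hypothetical:

```sql
-- Delete every row for which a row with the same name
-- and a smaller id exists. Only s1 rows are deleted.
DELETE s1
FROM student AS s1
JOIN student AS s2
  ON s1.name = s2.name
 AND s1.id > s2.id;

-- Optional follow-up, once the table is clean: a unique index
-- makes MySQL reject new duplicate names from then on.
ALTER TABLE student ADD UNIQUE KEY uq_student_name (name);
```

On a large table the join benefits from an index on name. Whether this outperforms the not in form depends on the data and the MySQL version, so it is worth testing on a copy first.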

When reposting, please credit the source: https://sulao.cn/post/412.html
