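Answer: To filter after deduplication, you need to understand the query's logical execution order. Prefer HAVING for conditions on aggregates, and use subqueries or window functions for more complex cases.
In MySQL, deduplication and filtering are among the most common everyday query operations. Many people deduplicate with DISTINCT or GROUP BY and then find they cannot easily filter further, especially when the condition depends on the deduplicated result. Once you understand the execution order and make sensible use of subqueries or HAVING, this becomes straightforward.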
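DISTINCT removes rows that are identical across every selected column, but it cannot be combined with WHERE to test an aggregate result, because WHERE is evaluated before both aggregation and DISTINCT. To filter after deduplication, wrap the deduplicated or grouped result in a subquery and filter in the outer query. For example, to list emails that appear on more than one order: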
SELECT email FROM (SELECT email, COUNT(*) AS cnt FROM orders GROUP BY email) t WHERE cnt > 1;
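The same pattern works when the inner query uses DISTINCT itself. The following is a minimal sketch, assuming a hypothetical `orders_demo` table whose columns and sample rows are invented for illustration: the inner query collapses duplicate (user_id, email) pairs, and the outer query then filters on that deduplicated set.

```sql
-- Hypothetical sample table; the name, columns, and rows are assumptions for illustration.
DROP TABLE IF EXISTS orders_demo;
CREATE TABLE orders_demo (
    id INT PRIMARY KEY,
    user_id INT,
    email VARCHAR(100),
    amount DECIMAL(10, 2),
    create_time DATETIME
);

INSERT INTO orders_demo (id, user_id, email, amount, create_time) VALUES
    (1, 1, 'a@example.com',  50.00, '2024-01-01 10:00:00'),
    (2, 1, 'a@example.com', 150.00, '2024-01-05 09:00:00'),
    (3, 2, 'b@example.com', 200.00, '2024-01-02 12:00:00'),
    (4, 3, 'a@example.com',  80.00, '2024-01-03 08:00:00');

-- Inner query: deduplicate (user_id, email) pairs.
-- Outer query: keep only emails shared by more than one distinct user.
SELECT email, COUNT(*) AS user_count
FROM (SELECT DISTINCT user_id, email FROM orders_demo) AS t
GROUP BY email
HAVING COUNT(*) > 1;
-- With the sample rows above this returns one row: a@example.com, 2
```

Without the inner DISTINCT, user 1's two orders would be counted twice and the result would report three rows for that email rather than two distinct users.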
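GROUP BY not only deduplicates, it also works with aggregate functions to compute statistics per group. HAVING filters on aggregate results, which WHERE cannot do. For example, to find users with at least two orders: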
SELECT user_id, COUNT(*) AS order_count FROM orders GROUP BY user_id
HAVING COUNT(*) >= 2;
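WHERE and HAVING can also be combined in one query: WHERE trims rows before grouping, and HAVING trims groups afterwards. Here is a small sketch of that split, again using a hypothetical `orders_demo` table with made-up rows.

```sql
-- Hypothetical sample table; the name, columns, and rows are assumptions for illustration.
DROP TABLE IF EXISTS orders_demo;
CREATE TABLE orders_demo (
    id INT PRIMARY KEY,
    user_id INT,
    amount DECIMAL(10, 2)
);

INSERT INTO orders_demo (id, user_id, amount) VALUES
    (1, 1,  50.00),
    (2, 1, 150.00),
    (3, 1, 120.00),
    (4, 2, 200.00),
    (5, 3,  80.00),
    (6, 3, 300.00);

-- WHERE runs first and discards individual orders of 100 or less.
-- HAVING runs after grouping and discards users with fewer than two remaining orders.
SELECT user_id, COUNT(*) AS big_orders
FROM orders_demo
WHERE amount > 100
GROUP BY user_id
HAVING COUNT(*) >= 2;
-- With the sample rows above this returns one row: user_id 1, big_orders 2

-- Moving the aggregate condition into WHERE is rejected by MySQL,
-- because aggregates do not exist yet when WHERE is evaluated:
-- SELECT user_id FROM orders_demo WHERE COUNT(*) >= 2 GROUP BY user_id;
```

Putting row-level conditions in WHERE rather than HAVING also means fewer rows reach the grouping step.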
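When you need to keep one specific record per group after deduplication (for example, the most recent one), use a window function such as ROW_NUMBER(), available in MySQL 8.0 and later. The query below keeps each user's latest order, and only if its amount exceeds 100: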
SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY create_time DESC) AS rn FROM orders) t WHERE rn = 1 AND amount > 100;
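Where the amount condition sits changes the meaning of the result. The sketch below, on a hypothetical `orders_demo` table with invented rows, contrasts filtering after ranking (as in the query above) with filtering before ranking. It assumes MySQL 8.0 or later.

```sql
-- Hypothetical sample table; the name, columns, and rows are assumptions for illustration.
DROP TABLE IF EXISTS orders_demo;
CREATE TABLE orders_demo (
    id INT PRIMARY KEY,
    user_id INT,
    amount DECIMAL(10, 2),
    create_time DATETIME
);

INSERT INTO orders_demo (id, user_id, amount, create_time) VALUES
    (1, 1, 150.00, '2024-01-05 09:00:00'),
    (2, 1,  50.00, '2024-01-10 09:00:00'),
    (3, 2, 200.00, '2024-01-02 12:00:00');

-- A: rank all orders, then filter. "Each user's latest order, if it exceeds 100."
SELECT user_id, amount, create_time
FROM (
    SELECT user_id, amount, create_time,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY create_time DESC) AS rn
    FROM orders_demo
) AS t
WHERE rn = 1 AND amount > 100;
-- Sample result: only user 2 (200.00); user 1's latest order is 50.00, so it is dropped.

-- B: filter first, then rank. "Each user's latest order among those exceeding 100."
SELECT user_id, amount, create_time
FROM (
    SELECT user_id, amount, create_time,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY create_time DESC) AS rn
    FROM orders_demo
    WHERE amount > 100
) AS t
WHERE rn = 1;
-- Sample result: user 1 (150.00) and user 2 (200.00).
```

The subquery is needed in both variants because the `rn` alias is computed in the SELECT step and is not yet visible to a WHERE clause at the same query level.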
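Those are the most commonly used techniques. The key is to understand the logical execution order of a SQL query: FROM → WHERE → GROUP BY → aggregate functions → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT. With sensible use of subqueries, HAVING, and window functions, precise filtering after deduplication is easy to achieve. None of this is complicated, but the details are easy to overlook.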
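To make that ordering concrete, here is a single annotated query that uses every clause in the chain. It is only an illustrative sketch on a hypothetical `orders_demo` table with invented rows; the numbers in the comments give the logical step at which each clause is evaluated.

```sql
-- Hypothetical sample table; the name, columns, and rows are assumptions for illustration.
DROP TABLE IF EXISTS orders_demo;
CREATE TABLE orders_demo (
    id INT PRIMARY KEY,
    user_id INT,
    amount DECIMAL(10, 2)
);

INSERT INTO orders_demo (id, user_id, amount) VALUES
    (1, 1, 150.00), (2, 1, 120.00), (3, 1,  50.00),
    (4, 2, 200.00), (5, 2,  40.00),
    (6, 3, 300.00), (7, 3, 130.00),
    (8, 4, 500.00);

-- Question: which distinct "large order" counts occur among high-spending users?
SELECT DISTINCT COUNT(*) AS order_count  -- 5. SELECT the aggregate, 6. DISTINCT removes repeats
FROM orders_demo                         -- 1. FROM picks the source rows
WHERE amount > 100                       -- 2. WHERE drops orders of 100 or less
GROUP BY user_id                         -- 3. GROUP BY forms one group per user
HAVING SUM(amount) > 250                 -- 4. HAVING drops user 2 (remaining sum 200)
ORDER BY order_count DESC                -- 7. ORDER BY sorts the deduplicated values
LIMIT 2;                                 -- 8. LIMIT caps the output
-- With the sample rows above: users 1 and 3 have 2 qualifying orders, user 4 has 1,
-- so the result is two rows: 2, then 1.
```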