本文通过示例进行实战学习,如何有效查找上月状态没有成功的记录。
需求说明
假设有过程在后台运行并在数据库中记录状态。现在需要查询上月状态从未成功的记录。
insert into exp_table values
(1 ,'2021-06-15' ,'FAILED'), -- <- PID 1 was successful, but in last month
(1 ,'2021-06-05' ,'FAILED'),
(1 ,'2021-06-01' ,'FAILED'),
(1 ,'2021-05-01' ,'SUCCESSFUL' ),
(2 ,'2021-06-15' ,'SUCCESSFUL'), -- <- PID 2 was successful failed within month
(2 ,'2021-06-05' ,'FAILED'),
(2 ,'2021-06-01' ,'SUCCESSFUL'),
(3 ,'2021-06-15' ,'FAILED' ), -- <- PID 3 only fails
(3 ,'2021-06-05' ,'FAILED'),
(3 ,'2021-06-01' , 'FAILED'),
(4 ,'2021-06-15' , 'SUCCESSFUL'), -- <- PID 4 only successful
(4 ,'2021-06-05' , 'SUCCESSFUL'),
(4 ,'2021-06-01' , 'SUCCESSFUL')
给定条件:
- 假设当前时间为 2021-06-16
- 查询上月只有失败状态的记录( 2021-05-16 ~ 2021-06-16 )
期望结果:
- PID 为1 和 PID 为3 的任务符合条件
实现说明
首先尝试获得每个PID的状态统计量,然后再通过HAVING过滤失败数量为1的记录,但结果不正确:
WITH dataset_by_status AS
(
SELECT pid, status, Count(*) AS counter
FROM exp_table
WHERE (status = 'FAILED' OR status = 'SUCCESSFUL')
AND exe_date >= ('2021-06-16'::date - interval '1 MONTH')
GROUP BY pid,status
ORDER BY pid
)
SELECT pid,
count(*) AS counter
FROM dataset_by_status
WHERE status = 'FAILED'
GROUP BY pid
HAVING count(*) = 1;
因为上面仅过滤失败状态,但PID 为 2 的记录也有成功的记录。下面看如何解决。
利用 fliter 子句
在count 函数中使用 filter子句实现:
select pid
from exp_table
group by pid
having count(*) filter (where status = 'SUCCESSFUL'
and exe_date >= '2021-06-16'::date - interval '1 month') = 0;
查询结果符合预期,但我们可以日期过滤放在where 条件的后面,提升查询性能:
select pid, count(*) filter (where status = 'FAILED') as failed
from exp_table
where exe_date >= '2021-06-16'::date - interval '1 month'
group by pid
having count(*) filter (where status = 'SUCCESSFUL') = 0;
总结
本文综合利用 with、日期运算、filter表达式,查询上月未成功运行的任务。
本文参考链接:https://blog.csdn.net/neweastsun/article/details/119141185